James Whittaker ，2009至2012年在google工作，担任Engineering Director for Google+ APIs ，负责 dev/test 工具链；2012年2月加入微软。这位老兄在谷歌的时候，写过系列博客文章，从组织结果，角色划分，人员工作职责等各个方面详细的介绍了谷歌是如何来做软件测试的。这些博客在测试领域都引起了相当大的轰动。如下是我对其博客的一些简单总结：
谷歌认为，一个团队如果坚持需要大量专门的测试人员，那一定是什么事情出问题了。拥有一个巨大的测试团队，那一定是开发/测试融合出现了问题，这种情况下，增加更多的测试人员是徒劳的。在谷歌的团队里，全民皆测试，SWE（Software Engineer ）是测试人员，SET（Software Engineer in Test）是测试人员，TE（Test Engineer）也是测试人员。
至于Glenford Myers在其著作《The Art Of Software Testing》中提到的“开发者最好要避免测试他自己的程序”这句箴言，谷歌根本就是置之不理的。SWE是主要的特性编码人员，他除了编码实现功能性能之外，还需要写大量的测试代码来测试其自己的代码，并且参与到small, medium, large等各级规模的测试中；SET是谷歌另一个有趣的岗位，他也是开发人员，但是主要关注可测试性实现上。
Tuesday, January 25, 2011
By James Whittaker
This is the first in a series of posts on this topic.
The one question I get more than any other is "How does Google test?" It's been explained in bits and pieces on this blog but the explanation is due an update. The Google testing strategy has never changed but the tactical ways we execute it has evolved as the company has evolved. We're now a search, apps, ads, mobile, operating system,and so on and so forthcompany. Each of these Focus Areas (as we call them) have to do things that make sense for their problem domain. As we add new FAs and grow the existing ones, our testing has to expand and improve. What I am documenting in this series of posts is a combination of what we are doing today and the direction we are trending toward in the foreseeable future.
Let's begin with organizational structure and it's one that might surprise you. There isn't an actual testing organization at Google. Test exists within a Focus Area called Engineering Productivity. Eng Prod owns any number of horizontal and vertical engineering disciplines, Test is the biggest. In a nutshell, Eng Prod is made of:
1. Aproduct teamthat produces internal and open source productivity tools that are consumed by all walks of engineers across the company. We build and maintain code analyzers, IDEs, test case management systems, automated testing tools, build systems, source control systems, code review schedulers, bug databases... The idea is to make the tools that make engineers more productive. Tools are a very large part of the strategic goal of prevention over detection.
2. Aservices teamthat provides expertise to Google product teams on a wide array of topics including tools, documentation, testing, release management, training and so forth. Our expertise covers reliability, security, internationalization, etc., as well as product-specific functional issues that Google product teams might face. Every other FA has access to Eng Prod expertise.
3.Embedded engineersthat are effectively loaned out to Google product teams on an as-needed basis. Some of these engineers might sit with the same product teams for years, others cycle through teams wherever they are needed most. Google encourages all its engineers to change product teams often to stay fresh, engaged and objective. Testers are no different but the cadence of changing teams is left to the individual. I have testers on Chrome that have been there for several years and others who join for 18 months and cycle off. Keeping a healthy balance between product knowledge and fresh eyes is something a test manager has to pay close attention to.
So this means that testers report to Eng Prod managers but identify themselves with a product team, like Search, Gmail or Chrome. Organizationally they are part of both teams. They sit with the product teams, participate in their planning, go to lunch with them, share in ship bonuses and get treated like full members of the team. The benefit of the separate reporting structure is that it provides a forum for testers to share information. Good testing ideas migrate easily within Eng Prod giving all testers, no matter their product ties, access to the best technology within the company.
This separation of project and reporting structures has its challenges. By far the biggest is that testers are an external resource. Product teams can't place too big a bet on them and must keep their quality house in order. Yes, that's right: at Google it's the product teams that own quality, not testers. Every developer is expected to do their own testing. The job of the tester is to make sure they have the automation infrastructure and enabling processes that support this self reliance. Testers enable developers to test.
What I like about this strategy is that it puts developers and testers on equal footing. It makes us true partners in quality and puts the biggest quality burden where it belongs: on the developers who are responsible for getting the product right. Another side effect is that it allows us a many-to-one dev-to-test ratio. Developers outnumber testers. The better they are at testing the more they outnumber us. Product teams should be proud of a high ratio!
Ok, now we're all friends here right? You see the hole in this strategy I am sure. It's big enough to drive a bug through. Developers can't test! Well, who am I to deny that? No amount of corporate kool-aid could get me to deny it, especially coming offmy GTAC talklast year where I pretty much made a game of developer vs. tester (spoiler alert: the tester wins).
Google's answer is to split the role. We solve this problem by having two types of testing roles at Google to solve two very different testing problems. In my next post, I'll talk about these roles and how we split the testing problem into two parts.
Wednesday, February 09, 2011
By James Whittaker
In order for the “you build it, you break it” motto to be real, there are roles beyond the traditional developer that are necessary. Specifically, engineering roles that enable developers to do testing efficiently and effectively have to exist. At Google we have created roles in which some engineers are responsible for making others more productive. These engineers often identify themselves as testers but their actual mission is one of productivity. They exist to make developers more productive and quality is a large part of that productivity. Here's a summary of those roles:
TheSWEorSoftware Engineeris the traditional developer role. SWEs write functional code that ships to users. They create design documentation, design data structures and overall architecture and spend the vast majority of their time writing and reviewing code. SWEs write a lot of test code including test driven design, unit tests and, as we explain in future posts, participate in the construction of small, medium and large tests. SWEs own quality for everything they touch whether they wrote it, fixed it or modified it.
TheSETorSoftware Engineer in Testis also a developer role except their focus is on testability. They review designs and look closely at code quality and risk. They refactor code to make it more testable. SETs write unit testing frameworks and automation. They are a partner in the SWE code base but are more concerned with increasing quality and test coverage than adding new features or increasing performance.
TheTEorTest Engineeris the exact reverse of the SET. It is a a role that puts testing first and development second. Many Google TEs spend a good deal of their time writing code in the form of automation scripts and code that drives usage scenarios and even mimics a user. They also organize the testing work of SWEs and SETs, interpret test results and drive test execution, particular in the late stages of a project as the push toward release intensifies. TEs are product experts, quality advisers and analyzers of risk.
From a quality standpoint, SWEs own features and the quality of those features in isolation. They are responsible for fault tolerant designs, failure recovery, TDD, unit tests and in working with the SET to write tests that exercise the code for their feature.
SETs are developers that provide testing features. A framework that can isolate newly developed code by simulating its dependencies with stubs, mocks and fakes and submit queues for managing code check-ins. In other words, SETs write code that allows SWEs to test their features. Much of the actual testing is performed by the SWEs, SETs are there to ensure that features are testable and that the SWEs are actively involved in writing test cases.
Clearly SETs primary focus is on the developer. Individual feature quality is the target and enabling developers to easily test the code they write is the primary focus of the SET. This development focus leaves one large hole which I am sure is already evident to the reader: what about the user?
User focused testing is the job of the Google TE. Assuming that the SWEs and SETs performed module and feature level testing adequately, the next task is to understand how well this collection of executable code and data works together to satisfy the needs of the user. TEs act as a double-check on the diligence of the developers. Any obvious bugs are an indication that early cycle developer testing was inadequate or sloppy. When such bugs are rare, TEs can turn to their primary task of ensuring that the software runs common user scenarios, is performant and secure, is internationalized and so forth. TEs perform a lot of testing and test coordination tasks among TEs, contract testers, crowd sourced testers, dog fooders, beta users, early adopters. They communicate among all parties the risks inherent in the basic design, feature complexity and failure avoidance methods. Once TEs get engaged, there is no end to their mission.
Ok, now that the roles are better understood, I'll dig into more details on how we choreograph the work items among them. Until next time...thanks for your interest.
Wednesday, February 16, 2011
By James Whittaker
Lots of questions in the comments to the last two posts. I am not ignoring them. Hopefully many of them will be answered here and in following posts. I am just getting started on this topic.
At Google, quality is not equal to test. Yes I am sure that is true elsewhere too. “Quality cannot be tested in” is so cliché it has to be true. From automobiles to software if it isn’t built right in the first place then it is never going to be right. Ask any car company that has ever had to do a mass recall how expensive it is to bolt on quality after-the-fact.
However, this is neither as simple nor as accurate as it sounds. While it is true that quality cannot be tested in, it is equally evident that without testing it is impossible to develop anything of quality. How does one decide if what you built is high quality without testing it?
The simple solution to this conundrum is to stop treating development and test as separate disciplines. Testing and development go hand in hand. Code a little and test what you built. Then code some more and test some more. Better yet, plan the tests while you code or even before. Test isn’t a separate practice, it’s part and parcel of the development process itself. Quality is not equal to test; it is achieved by putting development and testing into a blender and mixing them until one is indistinguishable from the other.
At Google this is exactly our goal: to merge development and testing so that you cannot do one without the other. Build a little and then test it. Build some more and test some more. The key here is who is doing the testing. Since the number of actual dedicated testers at Google is so disproportionately low, the only possible answer has to be the developer. Who better to do all that testing than the people doing the actual coding? Who better to find the bug than the person who wrote it? Who is more incentivized to avoid writing the bug in the first place? The reason Google can get by with so few dedicated testers is because developers own quality. In fact, teams that insist on having a large testing presence are generally assumed to be doing something wrong. Having too large a test team is a very strong sign that the code/test mix is out of balance. Adding more testers is not going to solve anything.
This means that quality is more an act of prevention than it is detection. Quality is a development issue, not a testing issue. To the extent that we are able to embed testing practice inside development, we have created a process that is hyper incremental where mistakes can be rolled back if any one increment turns out to be too buggy. We’ve not only prevented a lot of customer issues, we have greatly reduced the number of testers necessary to ensure the absence of recall-class bugs. At Google, testing is aimed at determining how well this prevention method is working. TEs are constantly on the lookout for evidence that the SWE-SET combination of bug writers/preventers are screwed toward the latter and TEs raise alarms when that process seems out of whack.
Manifestations of this blending of development and testing are all over the place from code review notes asking ‘where are your tests?’ to posters in the bathrooms reminding developers about best testing practices, our infamous Testing On The Toilet guides. Testing must be an unavoidable aspect of development and the marriage of development and testing is where quality is achieved. SWEs are testers, SETs are testers and TEs are testers.
If your organization is also doing this blending, please share your successes and challenges with the rest of us. If not, then here is a change you can help your organization make: get developers fully vested in the quality equation. You know the old saying that chickens are happy to contribute to a bacon and egg breakfast but the pig is fully committed? Well, it's true...go oink at one of your developer and see if they oink back. If they start clucking, you have a problem.
Wednesday, March 02, 2011
By James Whittaker
Crawl, walk, run.
One of the key ways Google achieves good results with fewer testers than many companies is that we rarely attempt to ship a large set of features at once. In fact, the exact opposite is often the goal: build the core of a product and release it themoment it is usefulto as large a crowd as feasible, then get their feedback and iterate. This is what we did with Gmail, a product that kept its beta tag for four years. That tag was our warning to users that it was still being perfected. We removed the beta tag only when we reached our goal of 99.99% uptime for a real user’s email data. Obviously, quality is a work in progress!
It’s not as cowboy a process as I make it out to be. In fact, in order to make it to what we call the beta channel release, a product must go through a number of other channels and prove its worth. For Chrome, a product I spent my first two years at Google working on, multiple channels were used depending on our confidence in the product’s quality and the extent of feedback we were looking for. The sequence looked something like this:
Canary Channelis used for code we suspect isn’t fit for release. Like a canary in a coalmine, if it failed to survive then we had work to do. Canary channel builds are only for the ultra tolerant user running experiments and not depending on the application to get real work done.
Dev Channelis what developers use on their day-to-day work. All engineers on a product are expected to pick this build and use it for real work.
Test Channelis the build used for internal dog food and represents a candidate beta channel build given good sustained performance.
TheBeta ChannelorRelease Channelbuilds are the first ones that get external exposure. A build only gets to the release channel after spending enough time in the prior channels that is gets a chance to prove itself against a barrage of both tests and real usage.
This crawl, walk, run approach gives us the chance to run tests and experiment on our applications early and obtain feedback from real human beings in addition to all the automation we run in each of these channels every day.
There are analytical benefits to this process as well. If a bug is found in the field a tester can create a test that reproduces it and run it against builds in each channel to determine if a fix has already been implemented.
Wednesday, March 23, 2011
By James Whittaker
Instead of distinguishing between code, integration and system testing, Google uses the language of small, medium and large tests emphasizing scope over form. Small tests cover small amounts of code and so on. Each of the three engineering roles may execute any of these types of tests and they may be performed as automated or manual tests.
Small Testsare mostly (but not always) automated and exercise the code within a single function or module. They are most likely written by a SWE or an SET and may require mocks and faked environments to run but TEs often pick these tests up when they are trying to diagnose a particular failure. For small tests the focus is on typical functional issues such as data corruption, error conditions and off by one errors. The question a small test attempts to answer is does this code do what it is supposed to do?
Medium Testscan be automated or manual and involve two or more features and specifically cover the interaction between those features. I've heard any number of SETs describe this as "testing a function and its nearest neighbors." SETs drive the development of these tests early in the product cycle as individual features are completed and SWEs are heavily involved in writing, debugging and maintaining the actual tests. If a test fails or breaks, the developer takes care of it autonomously. Later in the development cycle TEs may perform medium tests either manually (in the event the test is difficult or prohibitively expensive to automate) or with automation. The question a medium test attempts to answer is does a set of near neighbor functions interoperate with each other the way they are supposed to?
Large Testscover three or more (usually more) features and represent real user scenarios to the extent possible. There is some concern with overall integration of the features but large tests tend to be more results driven, i.e., did the software do what the user expects? All three roles are involved in writing large tests and everything from automation to exploratory testing can be the vehicle to accomplish accomplish it. The question a large test attempts to answer is does the product operate the way a user would expect?
The actual language of small, medium and large isn’t important. Call them whatever you want. The important thing is that Google testers share a common language to talk about what is getting tested and how those tests are scoped. When some enterprising testers began talking about a fourth class they dubbedenormousevery other tester in the company could imagine a system-wide test covering nearly every feature and that ran for a very long time. No additional explanation was necessary.
The primary driver of what gets tested and how much is a very dynamic process and varies wildly from product to product. Google prefers to release often and leans toward getting a product out to users so we can get feedback and iterate. The general idea is that if we have developed some product or a new feature of an existing product we want to get it out to users as early as possible so they may benefit from it. This requires that we involve users and external developers early in the process so we have a good handle on whether what we are delivering is hitting the mark.
Finally, the mix between automated and manual testing definitely favors the former for all three sizes of tests. If it can be automated and the problem doesn’t require human cleverness and intuition, then it should be automated. Only those problems, in any of the above categories, which specifically require human judgment, such as the beauty of a user interface or whether exposing some piece of data constitutes a privacy concern, should remain in the realm of manual testing.
Having said that, it is important to note that Google performs a great deal of manual testing, both scripted and exploratory, but even this testing is done under the watchful eye of automation. Industry leading recording technology converts manual tests to automated tests to be re-executed build after build to ensure minimal regressions and to keep manual testers always focusing on new issues. We also automate the submission of bug reports and the routing of manual testing tasks. For example, if an automated test breaks, the system determines the last code change that is the most likely culprit, sends email to its authors and files a bug. The ongoing effort to automate to within the “last inch of the human mind” is currently the design spec for the next generation of test engineering tools Google is building.
Those tools will be highlighted in future posts. However, my next target is going to revolve around The Life of an SET. I hope you keep reading.
Monday, May 02, 2011
By James Whittaker
The Life of an SET
SETs are Software Engineers in Test. They are software engineers who happen to write testing functionality. First and foremost, SETs are developers and the role is touted as a 100% coding role in our recruiting literature and internal job promotion ladders. When SET candidates are interviewed, the “coding bar” is nearly identical to the SWE role with more emphasis that SETs know how to test the code they create. In other words, both SWEs and SETs answer coding questions. SETs are expected to nail a set of testing questions as well.
As you might imagine, it is a difficult role to fill and it is entirely possible that the low numbers of SETs isn’t because Google has created a magic formula for productivity but more of a result of adapting our engineering practice around the reality that the SET skill set is really hard to find. We optimize on this very important task and build processes around the people who do it.
It is usually the case that SETs are not involved early in the design phase. Their exclusion is not so much purposeful as it is a by-product of how a lot of Google projects are born. A common scenario for new project creation is that some informal 20% effort takes a life of its own as an actual Google branded product. Gmail and Chrome OS are both projects that started out as ideas that were not formally mandated by Google but over time grew into shipping products with teams of developers and testers working on them. In such cases early development is not about quality, it is about proving out a concept and working on things like scale and performance that must be right before quality could even be an issue. If you can't build a web service that scales, testing is not your biggest problem!
Once it is clear that a product can and will be built and shipped, that's when the development team seeks out test involvement.
You can imagine a process like this: someone has an idea, they think about it, write experimental code, seek out opinions of others, write some more code, get others involved, write even more code, realize they are onto something important, write more code to mold the idea into something that they can present to others to get feedback ... somewhere in all this an actual project is created in Google's project database and the project becomes real. Testers don't get involved until it becomes real.
Do all real projects get testers? Not by default. Smaller projects and those meant for limited users often get tested exclusively by the people who build it. Others that are riskier to our users or the enterprise (much more about risk later) get testing attention.
The onus is on the development teams to solicit help from testers and convince them that their project is exciting and full of potential. Dev Directors explain their project, progress and ship schedule to Test Directors who then discuss how the testing burden is to be shared and agree on things like SWE involvement in testing, expected unit testing levels and how the duties of the release process are going to be shared. SETs may not be involved at project inception, but once the project becomes real we have vast influence over how it is to be executed.
And when I say "testing" I don't just mean exercising code paths. Testers might not be involved from the beginning ... buttestingis. In fact, an SET's impact is felt even before a developer manages to check code into the build. Stay tuned to understand what I am talking about.
Thursday, May 26, 2011
By James Whittaker
The Life of a TE
The Test Engineer is a newer role within Google than either SWEs or SETs. As such, it is a role still in the process of being defined. The current generation of Google TEs are blazing a trail which will guide the next generation of new hires for this role. It is the process that is emerging as the best within Google that we present here.
Not all products require the services of a TE. Experimental efforts and early stage products without a well-defined mission or user story are certainly projects that won’t get a lot of TE attention. If the product stands a good chance of being cancelled (in the sense that as a proof of concept it fails to pass muster) or has yet to engage users or have a well defined set of features, testing it is largely something that should be done by the people developing it.
Even if it is clear that a product is going to get shipped, Test Engineers have little to do early in the development cycle when features are still in flux and the final feature list and scope is undetermined. Overinvesting in testing too early can mean a lot of things get thrown away. Likewise, early testing planning requires fewer test engineers than later cycle exploratory testing when the product is close to final form and the hunt for missed bugs has a greater urgency.
The trick in staffing a project with Test Engineers has to do with risk and return on investment. Risk to the customer and to the enterprise means more testing effort and requires more TEs. But that effort needs to be in proportion to the potential return. We need the right number of TEs and we need them to engage at the right time and with the right impact.
Once engaged, TEs do not have to start from scratch. There is a great deal of test engineering and quality-oriented work performed by SWEs and SETs which is the starting point for additional TE work. The initial engagement of the TE is to decide things such as:
· Where are the weak points in the software?
· What are the security, privacy, performance and reliability concerns?
· Do all the primary user scenarios work as expected? For all international audiences?
· Does the product interoperate with other products (hardware and software)?
· In the event of a problem, how good are the diagnostics?
All of this combines to speak to the risk profile of releasing the software in question. TEs don’t necessarily do all of this work, but they ensure that it gets done and they leverage the work of others is assessing where additional work is required. Ultimately, test engineers are paid to protect users and the business from bad design, confusing UX, functional bugs, security and privacy issues and so forth. At Google, TEs are the only people on a team whose full-time job is to look at the product or service holistically for weak points. As such, the life of a Test Engineer is much less prescriptive and formalized than that of an SET. TE’s are asked to help on projects in all stages of readiness: everything from the idea stage to version 8, or even watching over a deprecated or “mothballed” project. Often, a single TE will even span multiple projects particularly those with specialty type skills like security.
Obviously, the work of a TE varies greatly depending on the project. Some TE’s spend much of their time programming, much like an SET, but with more of a focus on end-to-end user scenarios. Other TE's take existing code and designs determine failure modes and look for errors that will cause those failures. In such a role a TE might modify code but not create it from scratch. TE's must be more systematic and thorough in their test planning and completeness with a focus on the actual usage and system experience. TE's excel at dealing with ambiguity in requirements and at reasoning and communicating about fuzzy problems.
Successful TEs accomplish all this while navigating the sensitivities and sometimes strong personalities of the development and product team members. When weak points are found, test engineers happily break the software, and drive to get these issues resolved with the SWEs, PMs, and SETs.
Such a job description is a frightening prospect given the mix of technical skill, leadership, and deep product understanding and without proper guidance it is a role in which many would expect to fail. But at Google a strong community of test engineers has emerged to counter this. Of all job functions, the TE role is perhaps the best peer supported role in the company and the insight and leadership required to perform it successfully means that many of the top test managers in the company come from the TE ranks.
There is a fluidity to the work of a Google Test Engineer that belies any prescriptive process for engagement. TE’s can enter a project at any point and must assess the state of the project, code, design, and users quickly and decide what to focus on first. If the project is just getting started, test planning is often the first order of business. Sometimes TEs are pulled in late in the cycle to evaluate whether a project is ready for ship or if there are any major issues before an early ‘beta’ goes out. If they are brought into a newly acquired application or one in which they have little prior experience, they will often start doing some exploratory testing with little to no planning. Sometimes projects haven’t been released for quite a while and just need some touchups/security fixes, or UX updates—calling for an even different approach. One size rarely fits all for TEs at Google.
Tuesday, February 22, 2011
By James Whittaker
These posts have garnered a number of interesting comments. I want to address two of the negative ones in this post. Both are of the same general opinion that I am abandoning testers and that Google is not a nice place to ply this trade. I am puzzled by these comments because nothing could be further from the truth. One such negative comment I can take as a one-off but two smart people (hey they are reading this blog, right?) having this impression requires a rebuttal. Here are the comments:
"A sad day for testers around the world. Our own spokesman has turned his back on us. What happened to 'devs can't test'?" by Gengodo
"I am a test engineer and Google has been one of my dream companies. Reading your blog I feel that Testers are so unimportant at Google and can be easily laid off. It's sad." by Maggi
First of all, I don't know of any tester or developer for that matter being laid off from Google. We're hiring at a rapid pace right now. However, we do change projects a lot so perhaps you read 'taken off a project' to mean something far worse than the reality of just moving to another project. A tester here may move every couple of years or so and it is a badge of honor to get to the point where you've worked yourself out of a job by building robust test frameworks for others to contribute tests to or to pass off what you've done to a junior tester and move on to a bigger challenge. Maggi, please keep the dream alive. If Google was a hostile place for testers, I would be working somewhere else.
Second, I am going to dodge the negative undertones of the developer vs tester debate. Whether developers can test or testers can code seems downright combative. Both types of engineers share the common goal of shipping a product that will be successful. There is enough negativity in this world and testers hating developers seems so 2001.
In fact, I feel a confession coming on. I have had sharp words with developers in the past. I have publicly decried the lack of testing rigor in commercial products. If you've seen me present you've probably witnessed me showing colorful bugs, pointing to the screen and shouting "you missed a spot!" I will admit, that was fun.
Here are some other quotes I have directed at developers:
"You must be smarter than me because I couldn't write this bug if I was trying to."
"What happened, did the compiler get your eye?"
"What do you say to a developer with two black eyes? Nothing, he's already been told twice."
"Did you hear about the developer who locked himself in his car?"
Ah, those were the good old days! But it's 2011 now and I am objective enough to give developers credit when they step up to the plate and do their job. At Google, many have and they are helping to shame the rest into following suit. And this is making bugs harder to find. I waste so little time on low hanging fruit that I get to dig deeper to find the really subtle, really critical bugs. The signal to noise ratio is just a whole lot stronger now. Yes there are fewer developer jokes but this is progress. I have to make myself feel good knowing how many bugs have been prevented instead of how many laughs I can get on stage demonstrating their miserable failures.
This is progress.
And, incidentally developers can test. In some cases far better than testers. Modern testing is about optimizing the places where developers test and where testers test. Getting that mix right means a great product. Getting it wrong puts us back in 2001 where my presentations were a heck of a lot funnier.
In what cases are developers better testers that we are? In what cases are they not only poor testers but we're better off not having them touch the product at all? Well, that's the subject of my next couple of posts. In the meantime...