Thursday, August 27, 2009
Knowledge Acquisition Plan
Wednesday, August 26, 2009
Why Twitter, again?
Tuesday, August 25, 2009
Manager's Appraisal
Today the normal procedure is that managers perform an appraisal for their subordinates at least once a year. Promotions, demotions, and salary updates are normally based on the results of this appraisal. My question is: who performs the appraisal for managers? An obvious answer is their own managers, up to the CEO, who is accountable to the board of directors. The problem with this answer is that their managers will appraise them as subordinates. And my question is: who will assess their work as managers? In other words, who decides whether they are good managers or not? The so-called 180-degree appraisal, in my experience, seldom works: who will take the risk of saying something really serious to their managers?
Before we can reason about who decides whether a manager is good or bad, there is a simpler question to be addressed: what does it mean to be a good manager, and for whom? Good for whom? Presumably for the business organization (the Company). In other words, good managers are those who contribute to fulfilling the Company's mission in proportion to their compensation. What is the Company's mission? Too many people still believe it is to maximize shareholder value, in other words, to protect the private and public funds invested in the company. In reality it's not that simple. As P. Drucker claims, being profitable is, by and large, merely the price of staying in business over the long haul. If so, what is the long-term mission of the Company? Most leading experts follow P. Drucker's premise that the Company's mission is to ensure long-term wealth generation and personal growth for its local communities. How this relates to the current globalization trend is a separate story (perhaps another blog post).
Here is a short list of what the Manager has to do in order to contribute to the Company's long-term mission:
1. To ensure that the employees the Manager is responsible for have ALL the necessary conditions for productive work (salary, desk, chair and computer are only part of it; see A. Maslow's hierarchy of needs)
2. To ensure steady expansion of the Company's market share and development of new products and services
3. To set high quality and productivity standards, which will keep the Company ahead of the competition
4. To get rid of employees who, in one way or another, violate #1, #2, or #3
The common belief is that the Manager has to hold subordinates accountable. That's true, but without ensuring the right conditions this turns into permanent intimidation, micro-management, and, worst of all, the loss of the best people, who will simply decide to try their luck elsewhere.
Today, prospects for personal growth constitute the lion's share of the employee work-conditions package. The times when manual workers performed a dirty job they hated for some money to put food on their tables are over, at least in the high-tech industry. Now, in order to survive, the Company needs the best people it can attract. But the best people normally want to become better; they want to grow. They simply spend too much time at the office to afford otherwise.
To be able to grow, we have to work with the right people. Practical experience constitutes about 90% of personal growth, and we normally learn from each other. That is why #4 is so important: we have to get rid of the "wrong" people not just because they do not justify their salary, but because they deprive the "right" people of normal working conditions.
Who are the right people? Who can decide? Managers, for sure, who should themselves be the right people. Sounds like a chicken-and-egg problem. In a sense it is, but here are three basic traits I would suggest considering:
1. Mental health. Korzybski and Maslow give some very good insights. Too much needs to be said here (another blog post, sigh). In short, a very knowledgeable engineer who has serious communication and cooperation problems might not be the best choice ("We need geeks who are socially responsible" - Kent Beck).
2. The ability and willingness to quickly learn both technical and non-technical material. Every company has enough technology and corporate-politics specifics that everyone simply needs to know. The faster, the better.
3. A good, preferably broad, engineering and general (art, music, science) education. People coming from outside can often enrich the Company's technology and process portfolio simply by bringing another perspective.
Who should perform this appraisal I still do not know, but at least there is an initial checklist to start with. Good managers might be able to apply it to themselves.
Monday, August 24, 2009
Choose Your Process: Waterfall, RUP, Agile
Agile Testing
"People who look for easy money invariably pay for the privilege of proving conclusively that it cannot be found on this earth."
Jesse Livermore, “Reminiscences of a Stock Operator”
Introduction
"Easy money" does not exist in the software business, just as it does not exist on the stock market. Those, who think otherwise usually, pay a hefty fee to be proven wrong. Any hope of solving software quality problems through a numbers crunching campaign, without a serious study of the nature of the problem, is futile, and will most likely make matters even worse than they were. One does not hope to get a Wimbledon medal without spending years playing tennis. Why should it be different for software?
Quality and time-to-market are among the most important success factors of any business in general, and of the high-tech business in particular. When given a choice, customers will not tolerate poor service and shoddy products; they will switch to another vendor. Quality is perhaps the most profitable investment of time and money, but it does not come free. One needs to learn how to achieve it in a cost-effective and pragmatic way.
Let's see how Agile folks solve software quality problems, and why it might work, given enough time and energy invested in learning how to apply these techniques properly.
What is Agile Testing?
There is a widespread understanding that testing means that one person produces something and somebody else checks that there are no mistakes. This is a false impression.
W. Edwards Deming, the spiritual father of modern quality assurance and the father of the Japanese industrial revolution, makes it crystal clear: "Cease dependence on mass inspection to achieve quality. Improve the process and build quality into the product in the first place". Deming formulated his famous 14 management points in the early fifties of the last century. By the end of the century, his ideas had been widely adopted by the Agile Approach proponents in order to ensure software quality in the first place.
Briefly, Agile Testing means specifying tests before development starts. These tests are run automatically by developers and by continuous integration servers as many times as required in order to ensure that what we think we developed is actually what we developed. Unless the software passes all the tests, the corresponding feature is not considered "ready."
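As a minimal sketch of what "tests first" means in practice (the feature, the ParentalRating class, and its API are all invented for illustration): the test below would be written and committed before any production code exists, and the minimal implementation that eventually makes it pass follows.

import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;
import org.junit.Test;

// Written first: it does not even compile until a developer supplies
// ParentalRating, and the feature is "ready" only when the whole suite
// passes on the continuous integration server.
public class ParentalRatingTest {
    @Test
    public void blocksProgramsAboveTheConfiguredThreshold() {
        ParentalRating rating = new ParentalRating(12);
        assertTrue(rating.blocks(16));
        assertFalse(rating.blocks(7));
    }
}

// The minimal implementation that makes the test pass.
class ParentalRating {
    private final int threshold;
    ParentalRating(int threshold) { this.threshold = threshold; }
    boolean blocks(int programRating) { return programRating > threshold; }
}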
Those who are capable of specifying tests in advance fill the role of domain and quality assurance experts. Usually these are the most experienced members of the team. Needless to say, the very concept of blue-collar testers just does not exist in an agile team.
Now let’s try to understand what’s wrong with traditional, post-development testing.
What’s Wrong With Traditional Testing?
Delaying tests until the end of software development means serialization of the process. Serialization means very long delivery times. Long delivery times, in turn, mean that the risk of developing a technically perfect but completely useless system is too high. In order to ensure that we develop what is really required, we have to get feedback as early as possible. When development iterations are shortened to, for example, one month, post-development testing eats up most of the iteration's time budget, not leaving developers enough time to develop quality code.
Frankly, if you find many defects by the end of the iteration, what can you do about it? It’s too late to change the code.
Therefore:
- All critical acceptance tests must be ready before development starts.
- All team members take personal responsibility for code quality.
There is no place for the kind of adversarial Development-QC relationship implied by "we write code, they will find bugs in it". The whole team either succeeds or fails together.
The Importance of Test Automation
Even in traditional testing, tests are usually specified in advance, in the form of an Acceptance Test Plan (ATP) document. So what's the difference with Agile Testing? The main difference is that the ATP is usually a document, and by definition it cannot be run automatically. If it's possible at all, somebody at some point in time will convert this document into scripts to run regression tests automatically. The problem with this approach is that those who specify the tests cannot always verify the scripts, and those who write the scripts cannot always understand the domain well enough, and thus might introduce subtle mistakes while converting the ATP document into scripts. For non-trivial domains the probability of mistakes grows enormously (and where is the profit in simple domains?).
Fully automated regression tests are also often created only when the system is already developed. In effect, such tests are just reverse engineering of what the system does, not of what it is supposed to do.
In the case of User Interface-intensive applications (e.g. the EPG), automatic regression tests created this way come very late, are too sensitive to even the slightest changes in the GUI, and, in general, are not cost-effective.
If tests are not automated, but instead performed manually, the project will slow down to a crawl. The more features have been developed, the more regression tests are required. Humans are very bad at performing repetitive mechanical tasks. Therefore, not having automated regression tests means severely limiting the project's throughput.
Without thorough regression tests, we cannot guarantee backward compatibility, which in turn means we cannot deliver the next version from the main trunk of our source control to existing customers. As a result, multiple release and/or customer-specific branches will flourish in the version control system, and the overall maintenance cost will increase significantly.
Not having automatic regression tests also means we cannot refactor our existing code base in order to adapt it to new requirements and improve its general quality. As a result, the code will very soon reach the "don't touch me" status, and its quality will continuously deteriorate with every bug fix or change request.
Which Kind of Tests?
The Agile philosophy distinguishes between two basic types of automatic tests: Integrated and Unit tests. Integrated tests can be subdivided into Acceptance Tests, Endurance Tests, and Stress Tests.
To produce quality software, one has not only to learn each individual testing technique, but also to acquire a clear understanding of how these techniques complement each other, and why it is so important to apply all of them in the right proportion.
Integrated Tests: Acceptance Tests
Acceptance tests should be specified before development starts, in a format that enables automation. The scope of Acceptance Tests might vary from an end-to-end system down to a single component. Acceptance tests are specified in the form of HTML tables, called fixture tables, and are run using the Framework for Integrated Tests (FIT) or one of its extensions: FitLibrary or Fitnesse.
FitLibrary extends the original FIT with additional types of fixture tables, while Fitnesse wraps it in a Wiki site in order to facilitate collaboration on acceptance test specification. Fitnesse also provides some handy tools for managing large acceptance test suites.
There are three basic categories of fixture tables:
- to set up initial pre-conditions
- to exercise some system functionality
- to verify post-conditions
The fixture table names are automatically mapped onto the underlying programming language's class names. The main versions of FIT, FitLibrary and Fitnesse are developed in Java and then ported to other programming languages: C#, Python, Ruby and even C++. I have found the C++ version not to be user-friendly, and for Integrated Acceptance Tests of C/C++ modules we use a special integration between the bmock library and Java (the so-called bmock console mode).
Acceptance Tests specified using FIT or one of its flavors are extremely powerful for dealing with large permutations of input values: parental rating string formatting, error message prioritization, lengthy event sequences, and complex state machines.
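To give a feel for the format, here is a minimal sketch of a FIT column fixture for the parental rating example; the table contents, the ParentalRatingFormat fixture, and the RatingFormatter production class are all invented for illustration. FIT maps the table name onto a fixture class; header cells name public input fields, while headers ending in "()" name methods whose return values FIT compares against the expected cells:

<table>
  <tr><td colspan="3">ParentalRatingFormat</td></tr>
  <tr><td>rating</td><td>locale</td><td>display()</td></tr>
  <tr><td>12</td><td>en_GB</td><td>PG-12</td></tr>
  <tr><td>18</td><td>en_GB</td><td>18+</td></tr>
</table>

import fit.ColumnFixture;

public class ParentalRatingFormat extends ColumnFixture {
    public int rating;       // bound to the "rating" input column
    public String locale;    // bound to the "locale" input column

    public String display() {
        // Delegates to hypothetical production code under test.
        return RatingFormatter.display(rating, locale);
    }
}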
A critical trait of Acceptance Tests is that they are supposed to be run using only the core system under test, without involving any elements of the real environment: GUI, databases, file system, or heavy communication protocols (e.g. FTP). The reason is that we want our acceptance tests to be fully controlled and to run very quickly (we will have a lot of them). Dealing with the real environment (e.g. a real STB) typically slows the process down significantly and makes it more complex. Using Acceptance Tests this way also leads to a better, more modular design.
Acceptance Tests do not guarantee a high percentage of code coverage. The reason is that test complexity grows exponentially, and any attempt to cover all possible edge cases of all possible scenarios would lead to an unmanageably large test suite. Achieving 100% line coverage is the goal of unit testing.
Acceptance Tests do ensure proper functionality of the system, but they guarantee neither its proper structure nor its long-term maintainability. One has to combine Integrated and Unit tests in order to achieve the required code quality level.
Other Types of Integrated Tests
- Endurance Tests are intended to validate that the system under test will work for a certain number of hours without crashing. More specifically, these tests validate that there are no resource leaks (memory, file handles, sockets, etc.) in the system (see the sketch below).
- Stress Tests are intended to validate the system's throughput and latency under a certain workload (number of concurrent users). Formally speaking, the system's latency is a function of its throughput and the length of its internal queues (essentially Little's Law: the average number of requests in the system equals throughput times average latency). For single-user systems like the EPG, stress test goals typically need to be defined more specifically.
Both types of Integrated Tests are usually run in an environment which is as close to the real one as possible. In order to cope with test scenario complexity, we have to keep the variability of this kind of integrated tests within a limited scope: only a small number of selected, fully specified test scenarios.
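For illustration only, here is a crude endurance-test sketch in Java. EpgCore is a stand-in for the real system under test; a real endurance run would also track file handles and sockets, and would run for hours rather than a fixed loop count.

public class EnduranceTest {
    // Stand-in for the real system under test.
    static class EpgCore {
        void processNextEvent() { /* exercise one fully specified scenario */ }
    }

    public static void main(String[] args) {
        EpgCore core = new EpgCore();
        Runtime rt = Runtime.getRuntime();
        System.gc();
        long before = rt.totalMemory() - rt.freeMemory();

        for (long i = 0; i < 10000000L; i++) {
            core.processNextEvent();
        }

        System.gc();
        long after = rt.totalMemory() - rt.freeMemory();
        // Crude leak indicator: steady-state heap usage should not keep
        // growing with the number of processed events.
        System.out.println("Heap growth after run: " + (after - before) + " bytes");
    }
}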
Unit Tests - 100% Line Coverage
Unit tests provide a test for each branch of each method (or function) of each class (or module). In order to avoid an exponential blow-up of the tests' complexity, the system under test should be properly modularized.
Within the scope of a particular class's or module's unit tests, all the classes or modules it depends on are replaced by mocks. There are unit testing and mock object frameworks for every popular programming language: JUnit and EasyMock (or JMock) for Java, NUnit and NMock for C#, Boost.Test and bmock (developed by this author) for C/C++, etc.
The goal of unit testing is to achieve 100% line coverage. Only then can one be confident that all possible edge cases will not produce unpleasant surprises in the real environment. This is usually possible only through proper modularization of the system and the use of mock objects: there are types of edge cases which are virtually irreproducible in the real environment.
For some low-level, rare edge cases (e.g., out of memory), specific requirements do not exist. As long as the system behaves reasonably well and does not crash, any edge case handling mechanism will do. Applying this technique can reduce the size of the Acceptance Test suite substantially.
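As a minimal sketch of both points, using JUnit and EasyMock as mentioned above (ChannelDirectory and BannerFormatter are invented names): the first test checks the normal path; the second forces a lookup failure, an edge case that is hard to reproduce with a real channel directory but trivial with a mock.

import static org.easymock.EasyMock.*;
import static org.junit.Assert.assertEquals;
import org.junit.Test;

// Collaborator to be mocked.
interface ChannelDirectory {
    String nameOf(int channelNumber);
}

// Class under test: formats an EPG banner line.
class BannerFormatter {
    private final ChannelDirectory directory;
    BannerFormatter(ChannelDirectory directory) { this.directory = directory; }

    String banner(int channelNumber) {
        try {
            return channelNumber + " - " + directory.nameOf(channelNumber);
        } catch (RuntimeException e) {
            return String.valueOf(channelNumber); // degrade gracefully, do not crash
        }
    }
}

public class BannerFormatterTest {
    @Test
    public void formatsChannelNumberAndName() {
        ChannelDirectory directory = createMock(ChannelDirectory.class);
        expect(directory.nameOf(7)).andReturn("BBC One");
        replay(directory);
        assertEquals("7 - BBC One", new BannerFormatter(directory).banner(7));
        verify(directory);
    }

    @Test
    public void fallsBackToNumberWhenLookupFails() {
        ChannelDirectory directory = createMock(ChannelDirectory.class);
        expect(directory.nameOf(7)).andThrow(new RuntimeException("lookup failed"));
        replay(directory);
        assertEquals("7", new BannerFormatter(directory).banner(7));
        verify(directory);
    }
}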
Unit testing comes into its full advantage only when it is combined with test coverage measurement. There is a big difference between having 20% line coverage and having 100%. Achieving the former is relatively easy once one adopts the unit testing approach in principle. It is not even too hard to get to 85% line coverage. However, in order to get to 100%, one needs very high-quality, modular code.
If coverage is not 100%, you will never know whether the uncovered remainder is something completely unimportant or a missed edge case. The only practical option is to develop an automatic habit of maintaining 100% line coverage at all times.
We want our unit test suite to pass at least once through every line of source code. This is not the same as 100% branch coverage, where all possible branches are executed under all possible conditions. If code is developed following the Test-Driven Development approach, 100% branch coverage would be achieved automatically as a side effect. However, when talking about unit testing of legacy code, 100% branch coverage might be an impractical goal to strive for.
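A tiny invented example of the difference between the two metrics:

public class CoverageDemo {
    static int abs(int x) {
        if (x < 0) x = -x; // the condition and the assignment share one line
        return x;
    }

    public static void main(String[] args) {
        // This single call executes every line of abs(): 100% line coverage.
        // Yet the branch where (x < 0) is false is never taken, so branch
        // coverage is only 50%.
        System.out.println(abs(-3));
    }
}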
Exploratory Manual Tests
The Agile Testing philosophy does not preclude manual tests. Actually, the opposite is true: exploratory manual tests are treated as an integral part of the Agile Testing portfolio. The emphasis here is on the word exploratory. Our automatic tests are only as good as our knowledge of the system. If we missed something in the test specification, it will help very little that the target software passes all the tests. The only way to address this risk is to play with the system, trying to break it in some unusual, hard-to-predict way (today I would say: just wearing the hat of a naïve user). Some Agile teams adopt the practice of engaging the whole team in an exploratory test session at the end of the iteration; some teams delegate this task to domain experts, while others combine both techniques. For non-UI products, a specially tailored exploratory testing environment is required. For example, exploratory tests of non-UI modules developed in C/C++ could be performed using the bmock library console mode.
If a discrepancy between the desired system behavior and the Acceptance Tests specification is discovered, it should preferably be reflected in a change request rather than in a bug report. Stakeholders too often change their minds once they get an opportunity to play with a real, even only partially functional, system. Treating all these changes of mind as bugs could easily create a false impression of the product quality and would hurt the team's motivation.
Agile Testing Summary: The Key Points
- Agile Testing is about bug prevention rather than bug detection.
- All regression tests must be automated. Otherwise the development speed will eventually slow to a crawl, and maintaining multiple branches in the version control system will be inevitable.
- Acceptance Tests specify the requirements for each User Story in a form suitable for automatic execution. When the system passes its Acceptance Test suite, its functionality satisfies all existing requirements. Acceptance Tests guarantee neither proper handling of all possible edge cases nor a proper, maintainable code structure.
- Unit Test suites should provide a test for each branch of every method of every class, which leads to 100% line coverage. 100% line coverage guarantees reasonable handling of all possible edge cases and a maintainable, highly modular code structure. To avoid exponential growth of the unit tests' complexity, mock objects are usually required.
- Exploratory manual tests are performed at the end of each iteration by the whole team and/or domain experts in order to check whether something is missing from the formal Acceptance Tests specification. People engaged in Exploratory Tests usually try to break the system in unusual and hard-to-predict ways. Avoid interpreting discrepancies between the desired system behavior and the Acceptance Tests specification as bugs; rather, convert them into change requests.
- In order to achieve a proper level of software code quality one has to combine all types of tests: Acceptance, Unit, and Exploratory. Additional types of Integrated Tests (e.g. Endurance, Stress) are added to the automatic regression test suite where and when appropriate.
Thursday, August 20, 2009
Testing Untestable
Robert Martin (Uncle Bob) recently blogged about a Fitnesse bug he could not write a failing test for. My reading of the implicit logic of his post is roughly this:
- If I, Uncle Bob, who has been teaching the whole world how to do TDD, cannot test it, this bug is untestable.
- My class's unit test is the minimal unit test possible (because, see above).
- If I cannot test it using my favorite IDE, it's untestable.
- If I cannot test it using my unit test library (JUnit in this case), it's untestable.
- If I cannot test it in batch mode using my favorite build system (say, Ant), it's untestable.
To which I would reply:
- We all have blind spots. Gurus and experts are especially prone to them, since the rest of the world has convinced them that they know best. If such a thing as an untestable bug does exist, it should be verified and analyzed by a larger community. What if some of us, mere mortals, find a way to test it?
- Agile test automation needs a very accurate definition of terms and conditions (see below). What one developer considers a minimal unit test could still be an integrated test from a certain angle of view (see below).
- Tools are very important and useful, but they are by no means identical to the unit testing practice. If something cannot be tested using JUnit, it does not mean a unit test is not possible. It might require some more imagination and effort, but still be possible.
- Unit and Acceptance test automation suites specify unequivocally that if the system passes all its tests, it behaves according to the requirements, under the assumptions reflected in the tests. There is no claim that the system does not contain bugs in some other sense. Even more: this test automation suite IS the system requirements; anything else is just wishes or speculations. If, for example, it is essential for our system that a Java HashSet has a fixed order of elements when converted to a List, even when there are duplicates (R. Martin's case), we have to specify an automated test which validates this assumption (in practice it's a bit more complicated, see below).
- For every branch of every method of every class of whatever we decide to be our system core, it is possible to write a unit test which validates that this particular branch is developed according to the specification. All assumptions about the class's surroundings are reflected in the unit test using mocks.
- At the system boundary it is possible to introduce simple adapters which make unit testing of the core more convenient (see the sketch after this list). Unit testing of these adapters might be impractical; therefore they should not contain any essential functionality, but rather just raise the level of the interfaces.
- For every assumption about the underlying system's behavior, it is possible to write a simple unit test that disproves this assumption if it is false. The opposite, that is, writing an automated test which proves that all assumptions about the underlying system are correct in the general case, is not possible, or at least not practical.
- Passing all unit tests does not allow us to conclude that our system behaves correctly as a whole. For that purpose an acceptance test suite is required. As stated above, the unit and acceptance test suites collectively specify which functionality the system has to provide, and under which assumptions.
- It is not possible to prove that the system will never fail, will never do things not reflected in the automated test suite, or will not have some unpredictable defects emerging from putting multiple features together. The latter can be spotted only with manual exploratory tests.
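Returning to the boundary adapter point above, here is a minimal sketch (all names invented): the core depends on a narrow Clock interface; the trivial SystemClock adapter contains no logic worth unit testing, while the core is tested against a deterministic fake.

interface Clock {
    long now();
}

// Boundary adapter: too trivial to break, so it is left without unit tests.
final class SystemClock implements Clock {
    public long now() { return System.currentTimeMillis(); }
}

// Deterministic stand-in injected into the core's unit tests instead of the real clock.
final class FixedClock implements Clock {
    private final long time;
    FixedClock(long time) { this.time = time; }
    public long now() { return time; }
}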
To keep things simple, in this post I do not address additional types of tests such as stress, endurance, etc.; see my separate post on the subject. Now to the specific point mentioned in Robert Martin's post. My interpretation is as follows. There was an implicit assumption made somewhere in the new Fitnesse design that a Java HashSet will preserve the order of elements in conversion to a List, even when there are duplicates. There was a suspicion that this assumption is incorrect, and Robert came to the conclusion that this kind of bug is not unit testable. A quotation from his blog:
"Unfortunately, the order of the rows in the list that was copied from the set is indeterminate.
Now, maybe you missed the importance of that last paragraph. It described the bug. So let me re-emphasize. In the case of duplicate rows I was depending on the order of elements in a list; but I built the list from a HashSet. HashSets don’t order their elements. So the order of the list was indeterminate.
The fix was simple. I changed the HashSet into an ArrayList. That fixed it. I think…
The problem is that I had no way to reliably create the symptom. I could not write a test that failed because I was using a HashSet instead of an ArrayList. I could not mock out the HashSet because the bug was in the use of the HashSet. I could not run the application 1000 times to statistically see the fault, because the behavior did not change from run to run. The only thing that seemed able to sometimes expose the fault was a recompile! And I wasn’t about to put my system into a recompile-til-fail loop."
As I (Asher Sterkin) mentioned, I was unable to get a failing test due to the lack of some details, but I still claim that it is always possible to create a simple unit test which disproves ANY specific assumption about the underlying system. Here is a Java class I wrote specifically for this purpose:
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public final class TestSetDuplicates {
    public static void main(String[] args) {
        Set<String> rawSet = new HashSet<String>();
        String[] values = new String[] {
            "SuiteChildOne.SuiteSetUp",
            "SuiteChildOne.TestOneOne",
            "SuiteChildOne.TestOneTwo",
            "SuiteChildOne.SuiteTearDown",
            "SuiteChildOne.SuiteSetUp",     // duplicate
            "SuiteChildOne.TestOneThree",
            "SuiteChildOne.SuiteTearDown"   // duplicate
        };
        for (String i : values) rawSet.add(i);  // duplicates collapse in the Set
        List<String> list = new ArrayList<String>(rawSet);
        for (String i : list) System.out.format("%s \n", i);  // print the resulting order
        System.out.println("");
    }
}
This is obviously not a JUnit test, and strictly speaking it is not a test at all. It is a part of what is going to become a unit test for the HashSet-to-List conversion functionality. It also does not do much, and this is a very important point: whatever test code we write must do as little as possible, in order to avoid any side effects we may not be able to predict. To push this class through Rob's "recompile-til-fail" loop, I wrote a small Ruby script:
require 'ftools'

# Recompile the Java probe from scratch and return its output.
def compile_run()
  File.delete('TestSetDuplicates.class') if File.exist?('TestSetDuplicates.class')
  `javac TestSetDuplicates.java`
  return `java TestSetDuplicates`
end

first = compile_run()
puts first

# Repeat the compile-and-run cycle 1000 times; fail loudly if the
# HashSet-to-List order ever differs from the first run.
1000.times do |i|
  print "\r#{i}"
  current = compile_run()
  raise "Inconsistent HashSet Behavior(#{i}): #{current}" unless first == current
end
As mentioned above, I was unable to get a failing run. There are at least three possible explanations:
- I misunderstood Rob's problem (the most probable cause). Perhaps I just need to free up some time to grab the Fitnesse code from GitHub and investigate it first-hand.
- The problem is real, but it reproduces only on Rob's computer, his operating system, and/or his version of the JVM and JDK.
- The HashSet-to-List conversion is determinate, and the problem is elsewhere.