Tuesday, May 25, 2010

Software Quality Assurance at Design Time

Software engineering practice is called service engineering when it focuses on building telecommunications services as software components. Popular examples of such services are Call Forward, Call Hold, Call Waiting and Voicemail. Interestingly, service engineering is strongly oriented toward software quality. In this post, I conjecture that the “general” SDLC (the one used for anything other than telecommunications services) has a lot to learn from service engineering. The focus will be on software quality assurance at design time.
One of the differences between service engineering and “general” software engineering is the telecommunications Feature Interaction (FI) problem. It attracted the interest of a small international academic community, mainly between 1985 and 2006. During that period, the problem was thoroughly studied and many effective applications were deployed by the big players of the telecommunications industry. Small companies, and many of the big but emerging telecommunications companies, often build development teams that have no telecommunications background and have never heard of the FI problem. In such conditions, an important software quality assurance process is missing.
A feature is a small service. A service is built by assembling several features. For example, Call Forward is developed using the feature that receives a call request, plus the one that routes a call to a given destination, plus a timer, plus an announcer that plays “your call is being transferred”, etc. The feature interaction (FI) problem is the undesirable situation that arises when two or more features, running together, interact so that at least one of them displays an unexpected behavior. FIs are considered software integration defects. For example, you have programmed your home phone to block calls to a given number because you don’t want your kids to dial it. This is called Call Screening or Call Blocking. Your smart little monsters, however, discovered the benefits of FI long ago, and they have a workaround ready: a friend programs his cell phone to forward calls to the forbidden number, and they simply dial the friend’s number to be forwarded to the forbidden one. This is an FI between the Call Forward service and the Call Screening service.
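To make the kids’ workaround concrete, here is a minimal Python sketch of the two services running together. Everything in it (the `Network` class, its method names, the phone numbers) is hypothetical and only meant to illustrate the interaction, under the assumption that screening checks the dialed number and nothing else.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    """Toy network holding screening lists and forwarding targets (hypothetical model)."""
    blocked: dict = field(default_factory=dict)   # caller -> set of forbidden numbers
    forwards: dict = field(default_factory=dict)  # callee -> forwarding target

    def screen(self, caller, dialed):
        """Call Screening: looks only at the number that was dialed."""
        return dialed not in self.blocked.get(caller, set())

    def route(self, dialed):
        """Call Forward: follow forwarding until a final destination is reached."""
        seen = set()
        while dialed in self.forwards and dialed not in seen:
            seen.add(dialed)  # guard against forwarding loops
            dialed = self.forwards[dialed]
        return dialed

    def place_call(self, caller, dialed):
        """Return the number that finally rings, or None if the call is blocked."""
        if not self.screen(caller, dialed):
            return None
        return self.route(dialed)


net = Network()
net.blocked["home"] = {"555-0100"}        # the parents block this number
net.forwards["friend-cell"] = "555-0100"  # the friend forwards to it

direct = net.place_call("home", "555-0100")         # blocked by screening
via_friend = net.place_call("home", "friend-cell")  # screening is bypassed
print(direct, via_friend)
```

Each feature is correct on its own. The defect only appears when both run in the same call, which is why it is treated as an integration defect.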
Any software development or quality assurance manager wants to see as many software defects as possible avoided at the earliest stages of the SDLC. I guess service engineering managers are among the happiest software managers in the world, because FI situations are detected at feature design time. The detection process is part of the SDLC. When a new service has been designed, a model of that design is compared with the models of all the services that have already been deployed. If there is any FI, the service design is modified until there is no interaction. To model a service, languages like SDL or LOTOS are often used. Service model comparison is performed by automatic formal verification, which is beyond the scope of this post.
Let’s go to the most important part of the story: FI causes. These are the runtime execution conditions that produce FI. I think that being aware of them can be useful at design time and when writing integration test cases.
The following is a list of FI causes. I will not give examples from the telecommunications world. Rather, I will try to show that the causes are general by using general but real examples. I’m confident that readers will easily be able to map them onto their own software domain. Since we are thinking in terms of “general” software engineering, “feature” will mean any software function.
1. Assumption violation
1.1 Feature A uses data that is supposed to be static. However, feature B can modify it. Example: in a billing system there are administrator accounts, sales agent accounts, user accounts, etc. Only administrators and agents can change prices. A new feature is added to let sales agents have their own sales agents. The designers forgot to restrict the rights of second-level agents, so those agents can change prices in the system too.
1.2 Feature A is triggered by an event that is supposed to be produced under certain conditions. However, feature B can intercept that event, and then feature A will not run. Example: a server feature is supposed to react to a given TCP packet, but a newly developed feature intercepts and modifies all the packets received on that socket.
1.3 Feature A gives a piece of data a meaning that is different from the one given by feature B. Example: a SOAP client and server for which a given field has two different meanings.
1.4 Feature A uses data that is supposed to be unique. Feature B violates this assumption. Example: an IP address is supposed to be unique, but a Web interface screen allows users to set the same IP for several network appliances.
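For the uniqueness assumption in 1.4, one possible defense is to turn the assumption into an invariant that is enforced in a single place, so that no feature (a Web screen, an import script, etc.) can break it. The sketch below assumes a hypothetical `ApplianceRegistry` through which every IP assignment must go. The names and addresses are illustrative only.

```python
class DuplicateAddressError(ValueError):
    """Raised when an IP address is already assigned to another appliance."""


class ApplianceRegistry:
    """Hypothetical single entry point for assigning IPs to appliances."""

    def __init__(self):
        self._ip_by_appliance = {}

    def set_ip(self, appliance, ip):
        # Enforce the assumption that feature A relies on: one IP, one appliance.
        for other, other_ip in self._ip_by_appliance.items():
            if other != appliance and other_ip == ip:
                raise DuplicateAddressError(f"{ip} is already assigned to {other}")
        self._ip_by_appliance[appliance] = ip

    def owner_of(self, ip):
        """Feature A: safe only because set_ip guarantees uniqueness."""
        for appliance, assigned in self._ip_by_appliance.items():
            if assigned == ip:
                return appliance
        return None


registry = ApplianceRegistry()
registry.set_ip("router", "192.0.2.1")

try:
    registry.set_ip("printer", "192.0.2.1")  # what the Web screen would attempt
    rejected = False
except DuplicateAddressError:
    rejected = True

print(rejected, registry.owner_of("192.0.2.1"))
```

The same idea applies to 1.1: the data that feature A treats as static gets a single guarded write path, and every new feature has to go through it.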
2. Contradictory actions
2.1 Feature A has to perform an action that is forbidden by feature B. Example: the system admin can lock some database tables while customers need to access them.
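Contradictory actions can sometimes be spotted on paper, before any code exists, if each feature declares what it needs and what it forbids. The following is a rough sketch of such a design-time check. The declaration format (`requires` / `forbids` sets) and the action names are my own assumptions, not an established notation.

```python
def contradictions(features):
    """List (needing, forbidding, action) triples where one feature
    requires an action that another feature forbids."""
    found = []
    for a_name, a in features.items():
        for b_name, b in features.items():
            if a_name == b_name:
                continue
            for action in sorted(a["requires"] & b["forbids"]):
                found.append((a_name, b_name, action))
    return found


# Hypothetical feature declarations for the database example.
features = {
    "admin_table_lock": {
        "requires": {"lock:orders"},
        "forbids": {"read:orders", "write:orders"},
    },
    "customer_checkout": {
        "requires": {"read:orders", "write:orders"},
        "forbids": set(),
    },
}

conflicts = contradictions(features)
for needing, forbidding, action in conflicts:
    print(f"{needing} needs {action}, but {forbidding} forbids it")
```

This is far from the formal verification used in service engineering, but even a table like this one gives the integration test cases a starting point.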
3. Ambiguous event semantics
3.1 Two different situations create the same event. Example: many implementations of SIP (a VoIP signaling protocol) send back the response 500 Server Internal Error in situations where issuing a 403 Forbidden or a 603 Decline would be much better.
As a “supplement”, here is an FI cause for which I couldn’t find a general example. Maybe that’s because I’m a telecommunications guy.
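The SIP example can be sketched as a mapping from situations to responses. The status codes and reason phrases are the standard SIP ones. The `Outcome` enum and the two functions are hypothetical names used only to contrast an ambiguous design with a specific one.

```python
from enum import Enum


class Outcome(Enum):
    """Distinct situations a (hypothetical) SIP server may run into."""
    POLICY_REFUSED = "policy_refused"      # the server refuses to fulfil the request
    CALLEE_DECLINED = "callee_declined"    # the callee does not want the call
    INTERNAL_FAILURE = "internal_failure"  # something broke inside the server


SIP_RESPONSE = {
    Outcome.POLICY_REFUSED: (403, "Forbidden"),
    Outcome.CALLEE_DECLINED: (603, "Decline"),
    Outcome.INTERNAL_FAILURE: (500, "Server Internal Error"),
}


def ambiguous_response(outcome):
    """Anti-pattern: every situation produces the same event."""
    return (500, "Server Internal Error")


def specific_response(outcome):
    """One distinct response per situation."""
    return SIP_RESPONSE[outcome]


ambiguous_codes = {ambiguous_response(o)[0] for o in Outcome}
specific_codes = {specific_response(o)[0] for o in Outcome}
print(ambiguous_codes, specific_codes)
```

With the ambiguous version, a feature on the receiving side cannot tell a refused call from a crashed server, so whatever it does next is a guess.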
4. Race condition
4.1 Feature A is supposed to run when a given event times out after a delay T. The new feature B must never run if A can run. But feature B is programmed to run on a timeout that is shorter than T, so B will always run first and A never gets its chance. Example: Voicemail programmed on 3 rings vs. Call Forward on 4 rings. If the phone rings 3 times and nobody answers, the call goes to voicemail instead of being forwarded to the secretary.
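A timer conflict like this one is cheap to check at design time. The sketch below uses hypothetical ring counts in which Voicemail fires before Call Forward; the function names are mine and the check is deliberately naive (it only compares two timers).

```python
def fires_first(ring_timers):
    """Return the feature whose no-answer timer expires first."""
    return min(ring_timers, key=ring_timers.get)


def precedence_respected(ring_timers, preferred, fallback):
    """True when the preferred feature gets to run before the fallback."""
    return ring_timers[preferred] < ring_timers[fallback]


# Hypothetical configurations, expressed in number of rings.
racy = {"call_forward": 4, "voicemail": 3}
fixed = {"call_forward": 3, "voicemail": 5}

racy_winner = fires_first(racy)
racy_ok = precedence_respected(racy, "call_forward", "voicemail")
fixed_ok = precedence_respected(fixed, "call_forward", "voicemail")
print(racy_winner, racy_ok, fixed_ok)
```

In the racy configuration the fallback wins every time, so the preferred feature is dead code from the subscriber’s point of view.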
