Structuring the error injection tests
Error injection is a complex feature that demands focus and cleanliness in every area. It can also easily account for about 30–40 % of the total tests, which is not a small number.
Tests written for error injection should be well structured for two primary reasons: first, to make the tests easy to debug, and second, to improve code reuse across tests. Given the share of error injection tests in the total, good reuse can noticeably reduce the test development effort.
Some common characteristics of tests apply to error injection tests as well:
- Tests contain stimulus generation and response checking
- Tests are of two types: directed tests and constrained random tests
Error injection test stimulus and response checking
One key point that is often ignored is that data traffic is a part of every test. Make sure data traffic is flowing before the error injection, after the error injection, and after the recovery sequence completes. This is very important because the protocol is being built to communicate data. All tests therefore need to exercise data traffic alongside whatever else they are doing, because whatever else they are doing exists to aid reliable and efficient data communication.
Error injection tests will be characterized by:
- Stimulus containing
  - Type of the error injection being exercised
  - Trigger for recovery
- Checks to be performed on DUT
  - FSM state
  - Interrupts and configuration status registers
  - Recovery sequence
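The characteristics listed above could be captured in a small descriptor so that every test states its stimulus and its checks in the same shape. This is only a sketch: the class name, field names, and the example values (error type, register name, PDU names) are hypothetical and not taken from any particular testbench or protocol.

```python
from dataclasses import dataclass, field


@dataclass
class ErrorInjectionTestPlan:
    """Hypothetical descriptor for one error injection test."""

    # Stimulus
    error_type: str        # type of error injection being exercised
    recovery_trigger: str  # what is expected to start the recovery

    # Checks to be performed on the DUT
    expected_fsm_state: str
    # register or interrupt name -> expected value after the error
    expected_status: dict = field(default_factory=dict)
    # protocol data units expected during recovery, in order
    expected_recovery_sequence: list = field(default_factory=list)


# Example values are invented for illustration only.
plan = ErrorInjectionTestPlan(
    error_type="crc_corruption",
    recovery_trigger="corrupted_pdu",
    expected_fsm_state="RECOVERY",
    expected_status={"ERR_STATUS": 0x1},
    expected_recovery_sequence=["NAK", "RETRANSMIT"],
)
```

Keeping stimulus and checks in one record makes it easier to see, when debugging, what a given test intended to inject and what it expected to observe.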
Error injection tests directed versus constrained random selection
The key decision is which tests should be directed and which should be constrained random. Errors that have a high probability of occurrence and whose recovery sequence is implemented in hardware are clear candidates for constrained random verification, because you want to exercise them rigorously. Scenarios that have a low probability of occurrence and software-based recovery sequences are fine to exercise with directed tests. Ideally, everything would be constrained random if you have the luxury of schedule and resources.
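The selection rule above can be written down as a tiny helper, which is mostly useful as a checklist when triaging a list of error scenarios. The function name and return strings are hypothetical, and the "judgment_call" result for mixed cases is an assumption: the text only covers the two clear-cut combinations.

```python
def suggest_test_style(high_probability: bool, hardware_recovery: bool) -> str:
    """Suggest a test style for one error scenario (illustrative only)."""
    if high_probability and hardware_recovery:
        # Likely to occur and recovered by hardware: exercise rigorously.
        return "constrained_random"
    if not high_probability and not hardware_recovery:
        # Rare and recovered by software: a directed test is acceptable.
        return "directed"
    # Mixed cases are not covered by the rule; decide per project.
    return "judgment_call"
```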
Error injection test typical structure
Data traffic generation can use the existing sequences that were developed for normal operation verification. Do not jump into error injection verification until normal operation verification has reached some stability.
Set up the error injection in either a directed or a constrained random manner. In the directed case, the test itself creates the error configuration with the specific error injection type to be exercised and programs it into the BFM. In constrained random tests, weights are programmed for the errors to be enabled.
After the error injection, perform the required checks. The checks have to verify any error reporting associated with the error. Not all detected errors are reported, but those that are reported typically have to be checked by reading the configuration registers. There may also be requirements to check the states of some key finite state machines (FSMs). In fact, error injection tests may contribute to FSM functional coverage as well.
After the reporting and state checks, the recovery sequence has to be checked. The recovery sequence trigger has to be clearly identified. The recovery trigger can be:
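The two setup styles can be contrasted with a short sketch. Everything here is an assumption made for illustration: the error type names, the dictionary layout of the configuration, and the idea of choosing one error per protocol data unit. A real BFM would expose its own configuration interface.

```python
import random

# Hypothetical error types; "none" means the PDU is sent uncorrupted.
ERROR_TYPES = ["none", "crc_corruption", "dropped_pdu", "bad_sequence_number"]


def directed_error_config(error_type):
    """Directed: the test names exactly one error type to inject."""
    if error_type not in ERROR_TYPES:
        raise ValueError(f"unknown error type: {error_type}")
    return {"mode": "directed", "error_type": error_type}


def random_error_config(weights, num_pdus, seed):
    """Constrained random: weights decide how often each error is picked."""
    rng = random.Random(seed)  # seeded so a failing run can be reproduced
    names = list(weights)
    picks = rng.choices(names, weights=[weights[n] for n in names], k=num_pdus)
    return {"mode": "random", "per_pdu_errors": picks}


directed = directed_error_config("crc_corruption")
randomized = random_error_config(
    {"none": 90, "crc_corruption": 8, "dropped_pdu": 2}, num_pdus=20, seed=1
)
```

Seeding the generator matters in the constrained random case: the same seed and weights reproduce the same error pattern, which keeps failures debuggable.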
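A reporting check of this kind often reduces to reading a few registers and comparing them with what the injected error should have set. The sketch below assumes a hypothetical `read_register` callable standing in for the testbench's register access, and invented register names; only the errors that are expected to be reported should appear in `expected`.

```python
def check_error_reporting(read_register, expected):
    """Compare register reads against expected values.

    read_register: callable taking a register name and returning an int
                   (hypothetical stand-in for real register access).
    expected:      dict of register name -> expected value.
    Returns a list of mismatch descriptions; an empty list means a pass.
    """
    mismatches = []
    for name, want in expected.items():
        got = read_register(name)
        if got != want:
            mismatches.append(f"{name}: expected {want:#x}, got {got:#x}")
    return mismatches


# Usage with a fake register map in place of a DUT.
fake_registers = {"ERR_STATUS": 0x4, "INT_STATUS": 0x1}
problems = check_error_reporting(fake_registers.__getitem__, {"ERR_STATUS": 0x4})
```

Returning the full list of mismatches, rather than stopping at the first, gives a more complete picture in the log when a test fails.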
- Corrupted protocol data unit itself
- Timeout in case of missing protocol data units
- Protocol data unit following the corrupted protocol data unit
- Other
The recovery mechanism could be built into the hardware or initiated by the higher-level application. Typically, when it is handled by the higher-level application, it is some form of reset, whereas when it is handled by hardware, it uses a sequence of predefined protocol data units. The recovery sequence is checked by the BFM.
After the recovery sequence check completes, additional checks may be needed, such as confirming that some of the status registers have been cleared or that the state indicates readiness for normal operation.
After the recovery sequence completes, do not forget to add data traffic before calling it the end of the test.
Some optimizations are possible while writing the tests. Depending on how much of the recovery mechanism is common, it may be possible to combine multiple error injections in a single test file. In such tests, the type of error to be exercised can be passed through the command line. This minimizes the number of tests to be maintained.
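For hardware-driven recovery, the BFM-side check can be thought of as looking for the predefined sequence in the stream of observed protocol data units. The following is a minimal sketch under stated assumptions: the enum members mirror the trigger list above, while the function name and the PDU names in the usage example are hypothetical, and a real BFM would likely also check timing and PDU contents.

```python
from enum import Enum


class RecoveryTrigger(Enum):
    """Recovery triggers named in the text."""
    CORRUPTED_PDU = "corrupted protocol data unit itself"
    TIMEOUT = "timeout in case of missing protocol data units"
    FOLLOWING_PDU = "protocol data unit following the corrupted one"
    OTHER = "other"


def recovery_sequence_seen(observed, expected):
    """Return True if `expected` appears in `observed` as a contiguous run."""
    n = len(expected)
    if n == 0:
        return True
    return any(
        observed[i:i + n] == expected for i in range(len(observed) - n + 1)
    )


# Usage with invented PDU names.
observed_pdus = ["DATA", "NAK", "RETRANSMIT", "DATA"]
ok = recovery_sequence_seen(observed_pdus, ["NAK", "RETRANSMIT"])
```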
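Putting the steps of this section together, a test body might follow the skeleton below. It is a sketch, not a definitive flow: `FakeEnv` and all of its method names are hypothetical stand-ins for a real testbench environment, and the exact placement of the post-error traffic relative to the checks is one plausible ordering.

```python
class FakeEnv:
    """Hypothetical stand-in for a testbench; it only records step order."""

    def __init__(self):
        self.log = []

    def send_data_traffic(self, phase):
        self.log.append(f"traffic:{phase}")

    def inject_error(self, error_type):
        self.log.append(f"inject:{error_type}")

    def check_reporting_and_state(self):
        self.log.append("check:reporting_and_state")

    def check_recovery(self):
        self.log.append("check:recovery")

    def check_post_recovery_status(self):
        self.log.append("check:post_recovery_status")


def run_error_injection_test(env, error_type):
    env.send_data_traffic("before_error")    # normal traffic first
    env.inject_error(error_type)             # directed or random setup
    env.send_data_traffic("after_error")     # traffic continues after the error
    env.check_reporting_and_state()          # registers, interrupts, FSM state
    env.check_recovery()                     # recovery sequence, via the BFM
    env.check_post_recovery_status()         # status cleared, ready for traffic
    env.send_data_traffic("after_recovery")  # traffic again before end of test


env = FakeEnv()
run_error_injection_test(env, "crc_corruption")
```

Because every test shares this shape, the individual steps can be reused across tests, and only the error type and expected values need to change.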
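As an illustration of selecting the error type from the command line, the sketch below uses Python's `argparse` as a stand-in for whatever command-line mechanism the simulator or test runner provides. The option name `--error-type` and the listed error types are hypothetical; the assumption is that all listed errors share one recovery mechanism and can therefore share one test file.

```python
import argparse

# Hypothetical errors assumed to share a common recovery mechanism.
SHARED_RECOVERY_ERRORS = ["crc_corruption", "bad_sequence_number", "bad_length"]


def parse_args(argv=None):
    parser = argparse.ArgumentParser(
        description="Error injection test with a selectable error type"
    )
    parser.add_argument(
        "--error-type",
        choices=SHARED_RECOVERY_ERRORS,  # reject anything outside the group
        required=True,
        help="which error to inject in this run",
    )
    return parser.parse_args(argv)


# Equivalent to running: test.py --error-type crc_corruption
args = parse_args(["--error-type", "crc_corruption"])
selected_error = args.error_type
```

Restricting the option to a fixed set of choices means a typo in the error name fails immediately instead of silently running a test with no error injected.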