ENDORSEMENT

One of the many problems associated with traditional identification aids is that they are too trusting. The data entered by the user are accepted as correct.

Users being human, errors will occur. With traditional identification aids such as dichotomous keys, any error is lethal and the user, sent to the wrong end of the key, is irretrievably lost. Identification will fail (no taxon fits the specimen to be identified) or a wrong name will be reached.

With computer identification systems, the concept of graceful degradation (gradual loss of performance in response to a decrease in resources) is used with various meanings (Fortuner, 1993), but the expression generally refers to the gradual loss of the system's ability to correctly identify an unknown specimen as an increasing number of errors are made in the description of this specimen. Graceful degradation is the opposite of the brutal loss of performance of a dichotomous key. It is naturally part of identification methods such as the computation of similarity coefficients (if the computation is based on, e.g., 20 characters, the result will be 100% correct if all characters are correct, 95% correct if one error is made, 90% correct in the case of two errors, and so on). Elimination methods such as dichotomous keys or multiple entry keys (polyclaves) normally do not degrade gracefully. However, computer applications of such methods can be made to degrade somewhat gracefully. For example, a taxon may be retained in the list of possible answers if it differs from the unknown by only one character. This postpones the lethal consequence of an error made on this character. Of course, lethality would reappear in full strength if a second error were made.
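The contrast between the two behaviours can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (simple_matching, tolerant_elimination, with_errors), the 20-character taxon, and the use of a simple matching coefficient are hypothetical choices made here, not a description of how GENISYS is implemented.

```python
def simple_matching(unknown, taxon):
    """Fraction of the taxon's characters on which the unknown agrees."""
    matches = sum(1 for c in taxon if unknown.get(c) == taxon[c])
    return matches / len(taxon)


def tolerant_elimination(unknown, taxa, max_mismatches=1):
    """Keep every taxon that differs from the unknown by at most
    max_mismatches characters (assumed tolerance, default 1)."""
    kept = []
    for name, description in taxa.items():
        mismatches = sum(
            1 for c in description if unknown.get(c) != description[c]
        )
        if mismatches <= max_mismatches:
            kept.append(name)
    return kept


def with_errors(description, n_errors):
    """Copy a description and corrupt its first n_errors characters."""
    corrupted = dict(description)
    for key in list(corrupted)[:n_errors]:
        corrupted[key] = "wrong"
    return corrupted


# Hypothetical taxon described by 20 characters, all in state "a".
taxon = {f"char{i}": "a" for i in range(20)}

for n in range(3):
    observed = with_errors(taxon, n)
    # Similarity drops by one twentieth per error; the taxon survives
    # tolerant elimination until the second error is made.
    print(n, simple_matching(observed, taxon),
          tolerant_elimination(observed, {"T": taxon}))
```

With one error the similarity falls only slightly and the taxon is still retained by the tolerant elimination; with two errors the similarity keeps decreasing gradually, whereas the elimination rejects the taxon outright.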

A radical solution to this problem would be to use only correct characters, i.e., characters with a low probability of error. For example, it is less likely that an error will be made on the character "Leaf length" in centimeters than on the character "Leaf shape" with possible states defined as linear, linear-lanceolate, lanceolate, and ovate-lanceolate.

With GENISYS, the reliability of data entered may be assessed (if the user requests it) as an "endorsement score" computed by the system.

Four areas that influence the reliability of the data have been defined:

1. The expertise of the user. Obviously, data entered by an expert deserve more trust than data entered by a beginner.

2. What we call the PIF of the user, from a French slang word that means the nose and, by extension, intuition. If you prefer, in English, PIF stands for Personal Intuitive Feeling. Very often, we know when we are guessing at a character rather than actually seeing it. A high PIF value means we know the data are good. A low PIF value means we are not so sure.

The PIF itself depends on two factors: how clearly the character was observed, and how consistent it was from one specimen to the next. If a character was clearly seen in all specimens observed, then the PIF will be high. If the character is said to be maybe present in one or a few of the specimens, the PIF will be very low.

3. A third area is the general observation set-up. For nematodes, the domain of expertise of the biologist member of the GENISYS team, this includes the number of specimens and their quality (for nematodes: freshly killed, well fixed, or distorted specimens), and the type and quality of the optics. Other biological domains may have slightly different observation set-ups.

4. The last area is the character itself, including three aspects: conspicuity, ambiguity, and variability.

Computation of the endorsement score

To evaluate the endorsement factor of a character state or value entered by the user, the system must have access to the values attached to all the factors of endorsement listed above. Some are taken from the metadata permanently attached to each character in the schema (conspicuity, ambiguity, variability). Others come from the user data sheet, a permanent description of the user's areas of expertise and of the general observation set-up the user normally works with (with the possibility of changing the default values). The number and quality of specimens are entered once at the beginning of each identification session. Finally, the only value that needs to be given during the identification itself is the PIF value attached to each character entered.

The computation of the endorsement factor of a character state or value can use an algorithmic approach or a fuzzy rule-based approach (Diederich & Fortuner, 1996).

The formula selected for the algorithmic approach uses the factors described above, except for expertise and Pif, to compute what we call a computed Pif, or c.Pif for short, using an arithmetic combination of the factors after assigning numerical values (0.0 to 1.0) to each entry under the factor. The endorsement can then be computed as a weighted average of the Pif and the c.Pif, with the weights determined by the level of expertise, to give the formula

Endorsement = Expertise * Pif + (1 - Expertise) * c.Pif.    (1)

Thus the higher the level of Expertise, the more the endorsement relies on the Pif, while the lower the level of expertise, the more it relies on the c.Pif.
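Formula (1) can be illustrated with a minimal Python sketch. The function name endorsement and the numerical values in the example are assumptions chosen for illustration; the c.Pif is taken as an input because the arithmetic combination that produces it is not detailed here.

```python
def endorsement(expertise, pif, c_pif):
    """Weighted average of Pif and c.Pif, as in formula (1).

    All three arguments are expected in the range 0.0 to 1.0.
    """
    for name, value in (("expertise", expertise),
                        ("pif", pif),
                        ("c_pif", c_pif)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")
    return expertise * pif + (1.0 - expertise) * c_pif


# Hypothetical values: the user is confident (Pif = 0.8) but the
# computed Pif is low (c.Pif = 0.4).
expert_score = endorsement(expertise=0.9, pif=0.8, c_pif=0.4)
beginner_score = endorsement(expertise=0.1, pif=0.8, c_pif=0.4)
print(round(expert_score, 2), round(beginner_score, 2))
```

With these illustrative values, the expert's score stays close to the Pif (about 0.76) while the beginner's score stays close to the c.Pif (about 0.44).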

In the fuzzy rule-based approach, the same three factors (expertise, Pif and c.Pif) were used to formulate 15 basic rules. The first three rules essentially state "Trust the expert", while the next three rules state "Trust the c.Pif." The remaining nine rules cover the intermediate cases. Additional rules can be added to the basic rules to fine-tune the system.

Use of endorsement

An endorsement score attached to each character state or value entered by the user may be used in various ways.

For example, any method or tool that proceeds by elimination should use only the most reliable data, i.e., data with high endorsement scores. Elimination should stop when all reliable characters have been used and no taxon should be rejected only because of a mismatch on a non-reliable character (one with a low endorsement score).
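A minimal Python sketch of elimination restricted to reliable characters is given here. The function name eliminate, the threshold of 0.7, and the two example taxa are hypothetical; the text does not specify where the line between reliable and non-reliable characters should be drawn.

```python
def eliminate(unknown, endorsements, taxa, threshold=0.7):
    """Reject a taxon only on a mismatch for a reliable character.

    unknown:      character -> state observed by the user
    endorsements: character -> endorsement score (0.0 to 1.0)
    taxa:         taxon name -> {character -> state}
    threshold:    assumed cut-off above which a character is reliable
    """
    reliable = [c for c in unknown if endorsements.get(c, 0.0) >= threshold]
    kept = []
    for name, description in taxa.items():
        # Characters with a low endorsement score are ignored, so they
        # can never cause a rejection.
        if all(description[c] == unknown[c]
               for c in reliable if c in description):
            kept.append(name)
    return kept


unknown = {"leaf_length_cm": "5-10", "leaf_shape": "lanceolate"}
endorsements = {"leaf_length_cm": 0.9, "leaf_shape": 0.3}
taxa = {
    "Taxon A": {"leaf_length_cm": "5-10", "leaf_shape": "linear"},
    "Taxon B": {"leaf_length_cm": "10-20", "leaf_shape": "lanceolate"},
}
print(eliminate(unknown, endorsements, taxa))
```

In this example Taxon A is retained although it mismatches on leaf shape, because that character has a low endorsement score, whereas Taxon B is rejected on the reliable leaf length.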

Similarity coefficients often include weights attached to each character. Setting the weights equal to the endorsement scores of the successive characters would give a lower weight to the less reliable characters, those for which an error is most likely to occur. This would enhance the natural graceful degradation of these kinds of approaches. In the 20-character example above, if an error is made on one difficult character with weight = 0.1 whereas the 19 other characters are easy characters each with weight = 1, the result will be 99.5% correct (19/19.1) in spite of the error made.
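The arithmetic of this example can be checked with a short Python sketch. The function name weighted_similarity and the character names are hypothetical, and a weighted simple matching coefficient is assumed as the similarity measure.

```python
def weighted_similarity(unknown, taxon, weights):
    """Weighted fraction of the taxon's characters matched by the unknown."""
    total = sum(weights[c] for c in taxon)
    agreed = sum(weights[c] for c in taxon if unknown.get(c) == taxon[c])
    return agreed / total


# Hypothetical taxon with 20 characters, all in state "a".
taxon = {f"char{i}": "a" for i in range(20)}

# char0 is the difficult character (endorsement 0.1); the 19 others
# are easy characters (endorsement 1.0).
weights = {c: 1.0 for c in taxon}
weights["char0"] = 0.1

# The user makes an error on the difficult character only.
unknown = dict(taxon)
unknown["char0"] = "wrong"

similarity = weighted_similarity(unknown, taxon, weights)
print(f"{similarity:.1%}")
```

The agreed weight is 19 out of a total of 19.1, which rounds to the 99.5% quoted above, compared with 95% when all 20 characters carry equal weight.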

__________________
Diederich, J., Fortuner, R. 1996. Endorsement of observations in identification. Proceedings Fifth IEEE International Conference on Fuzzy Systems, Sept. 8-11, 1996, New Orleans, Louisiana, USA, 175-179.