Valenstein, P. (1990). Evaluating diagnostic tests with imperfect standards. American Journal of Clinical Pathology, 93, 252-258.
Saah, A. J., & Hoover, D. R. (1997). “Sensitivity” and “specificity” reconsidered: the meaning of these terms in analytical and diagnostic settings. Annals of Internal Medicine, 126, 91-94.

The FDA recommends that your labeling characterize diagnostic test performance for all intended users (laboratories, health care providers, and/or home users). When a new diagnostic test is developed, it is often necessary to compare its performance with that of an existing method.

If the tests in question give qualitative results (for example, a test indicating the presence or absence of a disease), the use of measures such as sensitivity and specificity or percent agreement is well established. Different methods are needed for tests that yield quantitative results. The paper by Dunet et al. published in this issue provides an example of the use of some of these methods.¹

The diagnostic accuracy of a new test refers to the extent of agreement between the result of the new test and the reference standard. We use the term reference standard as defined in STARD. That is, a reference standard is “considered the best available method for determining the existence or absence of the target condition.” It divides the intended use population into only two groups (condition present or condition absent) and does not take into account the results of the new test being evaluated.

Thibodeau, L. A. (1981). Evaluating diagnostic tests. Biometrics, 37, 801-804.

Two-sided 95% score confidence intervals for positive percent agreement and negative percent agreement, conditional on the observed non-reference standard results (that is, ignoring variability in the non-reference standard), are (78.8%, 96.4%) and (93.5%, 98.8%), respectively. A two-sided 95% confidence interval for overall percent agreement is (92.4%, 97.8%). See Altman et al. (2000) and the latest edition of CLSI EP12-A for a brief discussion of how to calculate score confidence intervals and, alternatively, exact (Clopper-Pearson) confidence intervals.

The intraclass correlation coefficient (ICC) is an alternative to the Pearson correlation that is better suited to comparing diagnostic tests. It was first proposed by Fisher⁴ and is defined by assuming that the diagnostic test results follow a one-way ANOVA model with a random subject effect. This random effect accounts for the repeated measurements on each subject. The ICC is defined as the ratio of the between-subject variance ($\sigma_\alpha^2$) to the total variance, which consists of the between-subject variance and the within-subject variance ($\sigma_\epsilon^2$).
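As a rough illustration of the ICC described above, the following sketch estimates the two variance components from a one-way random-effects ANOVA using the usual mean-square estimators. The function name `icc_oneway`, the input layout (one list of repeated results per subject, with the same number of results for every subject), and the example values are assumptions made for this sketch, not something given in the text.

```python
def icc_oneway(measurements):
    """Estimate the ICC from a one-way random-effects ANOVA.

    `measurements` is a list with one inner list per subject; every inner
    list holds the same number k >= 2 of repeated results (assumed layout).
    """
    n = len(measurements)
    k = len(measurements[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in measurements):
        raise ValueError("need >= 2 subjects, each with the same k >= 2 results")

    grand_mean = sum(sum(row) for row in measurements) / (n * k)
    subject_means = [sum(row) / k for row in measurements]

    # Sums of squares between subjects and within subjects.
    ss_between = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_within = sum(
        (x - m) ** 2
        for row, m in zip(measurements, subject_means)
        for x in row
    )

    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))

    # Method-of-moments estimates of the two variance components.
    var_between = (ms_between - ms_within) / k  # estimate of sigma_alpha^2
    var_within = ms_within                      # estimate of sigma_epsilon^2

    return var_between / (var_between + var_within)


# Hypothetical data: three subjects, two results each.
example = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(f"ICC = {icc_oneway(example):.3f}")
```

Because the between-subject variance is estimated by subtraction, this estimator can come out negative when the within-subject mean square exceeds the between-subject one, and it is undefined when every result is identical.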
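The score intervals quoted above for the agreement percentages can be sketched with the Wilson score formula. The text does not give the underlying 2x2 table, so the counts below are assumptions, chosen so that the resulting intervals land close to the quoted figures. The function names and the cell labels `a`, `b`, `c`, `d` are likewise hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Two-sided Wilson score interval for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


def agreement_summary(a, b, c, d):
    """Percent agreement of a new test against a non-reference standard.

    a: both positive          b: new positive, comparator negative
    c: new negative, comparator positive          d: both negative
    Each entry is (point estimate, (lower, upper)).
    """
    total = a + b + c + d
    return {
        "positive percent agreement": (a / (a + c), wilson_interval(a, a + c)),
        "negative percent agreement": (d / (b + d), wilson_interval(d, b + d)),
        "overall percent agreement": ((a + d) / total, wilson_interval(a + d, total)),
    }


# Assumed counts for illustration only; they are not taken from the text.
summary = agreement_summary(a=40, b=5, c=4, d=171)
for name, (estimate, (low, high)) in summary.items():
    print(f"{name}: {estimate:.1%} ({low:.1%}, {high:.1%})")
```

The positive and negative agreement intervals condition on the comparator's column totals, which matches the idea of ignoring variability in the non-reference standard. An exact (Clopper-Pearson) interval would replace `wilson_interval` with one based on binomial tail probabilities.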