# Receiver Operating Characteristic (ROC) Curve

The ROC Curve is a plot of values of the False Positive Rate (FPR) versus the True Positive Rate (TPR) for all possible cutoff values from 0 t o 1.

Example 1: Create the ROC curve for Example 1 of Comparing Logistic Regression Models.

The first portion of the analysis from Comparing Logistic Regression Models is shown here in Figure 1.

Figure 1 – Logistic regression data analysis

We begin by taking the observed values of success and failures in Logistic Regression summary format together with the calculated p-Pred values (i.e.  columns H, I and L from Figure 1) and sorting these by the p-Pred values. This can be done using the SelectCols supplemental function  as follows (referring to Figure 1):

=SelectCols(H6:L17,”5,2,1”,3)

The result is shown in columns AI, AJ and AK of Figure 2.

Figure 2 – ROC Table and Curve

Next create the cumulative values for Failure and Success (columns AL and AM) and then the values of FPR and TPR for each row (columns AN and AO). E.g. these entries for row 8 are calculated via the following formulas:

Figure 3 – Selected formulas from Figure 2

The ROC curve can then be created by highlighting the range AN6:AO18 and selecting Insert > Charts|Scatter and adding the chart and axes titles. The result is shown on the right side of Figure 2. The actual ROC curve is a step function with the points shown in the figure.

Observation: The higher the ROC curve the better the fit. In fact the area under the curve (AUC) can be used for this purpose. The closer AUC is to 1 (the maximum value) the better the fit. Values close to .5 show that the model’s ability to discriminate between success and failure is due to chance.

For Example 1, the AUC is simply the sum of the areas of each of the rectangles in the step function. The formula for calculating the area for the rectangle corresponding to the p-Pred value in row 8 (i.e. the formula in cell AP8) is shown in Figure 3. The formula for calculating the AUC for Example 1 (cell AP19) is =SUM(AP6:AP18). The calculated value of .845587 shows a pretty good fit.

Observation: The Real Statistics Logistic Regression data analysis tool automatically creates the ROC curve as described above.

### 24 Responses to Receiver Operating Characteristic (ROC) Curve

1. raseeda says:

Hi ..this is good example for me.,I have a question hope u don’t mind to answer.
If lets say, I have set of training data. Is ROC used to evaluate the threshold that I got fro this training data? If let say the ROC rate is 0.9, then this model of training and threshold data is good to test another data.?

Thanks

• Charles says:

Raseeda,
An AUC value of .9 is quite high and so I would think that the model would be good enough to test other values.
Sorry that I haven’t responded sooner. I seemed to have overlooked your question until now.
Charles

2. Rose says:

Hi Charles,

I have been puzzled on how to organize the data in excel until I found your blog.
Its help, thanks !

• Rose says:

Appreciate if you could answer my question. What if I am testing existing models i.e. logit/MDA (hence, using the same coefficient, plugged in the variables weight into the model), for that I only have the probability/scoring results. How should I perform/derive ROC from this results? Many thanks.

• Charles says:

Rose,
Charles

• Rose says:

ok, Charles.
thank you

3. Jimmy says:

It is a good overall on how to perform ROC
Is there a way using Excel to compare diagnostic accuracies of three diagnostic methods using 3 ROC curves?

• Charles says:

Jimmy,
I don’t completely understand your question. Can you give me an example?
Charles

• Charles says:

Jimmy,
Probably, but I haven’t done this yet in my software. Another potential future enhancement.
Charles

4. Jimmy says:

I have 3 methods to diagnose whether a segment of nerve is torn or not, I want to compare their diagnostic accuracies in correctly diagnose a torn nerve by comparing their ROC.

• Charles says:

Jimmy,
I don’t current provide any means of comparing ROC’s in the software. Of course if the ROC are generated from my software they can be compared visually.
Charles

5. Lizzie says:

Hi,

Congratulations, its a great blog!

My question is: I have 2 classifiers (A/B).The classificador A had a better accuracy and B a better AUC. So, which should I use?

Thank you so much!

Lizzie

• Charles says:

Lizzie,
How are you measuring “accuracy”?
Charles

6. BK says:

Hey Charles,

Just a quick question. What are success and failure values? I have a set of data from which I calculated sensitivity and specificity. I have different cut-off values. I simply tried scatter plotting sen vs 1-spe but it did not create the ROC curve. Most of the points are clustered together. Am I doing something wrong?

Thanks!

• Charles says:

It does sound like you have done something wrong, since the ROC curve should have a characteristic shape, but I would have to see your data to really be able to answer your question.
Charles

• BK says:

I am not sure how to post my data here but roughly, I have two groups (normal and positive). For each group, there are 80 samples. I calculated cutoff values for each ‘marker’, which I extracted from the raw data with my own methods. Then for each marker, I used the cutoff values to categorize each sample into either n (normal) or p (positive). For example, for one marker or cutoff value, I assigned n or p depending on whether the raw data value exceeds the cutoff or not. As you know, from this, I can calculate specificity and sensitivity.

Then for each marker (since each marker has its own cutoff value) I graphed sen vs 1 – spe.

7. Suveena says:

i dont know ,How to perform test set versus training set and its calculation.
I have a set of data. I want to calculate true positive and false positive. some pls say

• Charles says:

Sorry, but I don’t understand your question.
Charles

I have a question:I’ve done some classification algorithms in weka software.Now I want to draw a ROC curve or PR curve according to weka results.How can I do it?

• Charles says:

I am not familiar with weka, but as long as it can output a plain text file you can use Excel’s Data > Get External Data | From Text capability to get the data into Excel.
Charles

9. Kat says:

Hi
I have conducted a study where an imaging modality was used diagnose a disease and that was compared to a gold standard laboratory test. I used a “truth table” (2×2) table to calculate sensitivity, specificity, PPV and NPV. I not sure which values or how to reorganise my data to be able to use the values to create a ROC curve on excel. Please help

10. Nikki says:

Hi Charles,
I have generated a ROC curve and AUC in graphpad prism as I have continuous data (rather than binary) in both a control and treatment group. From the output I have then calculated Youdens Index to ascertain the cut off point that differentiates between the two states, this being 0.56. I now need to back calculate what value (in terms of concentration) the index is referring to, to give a meaningful result. I have read elsewhere to use a logistic regression equation to do this but am not sure of what variables to include or whether this is the correct approach. Would you be able to shed some light on this please.

The data that graphpad has generated includes; AUC, std error, 95% confidence and pvalue in addition to sensitivity, specificity and likelihood

Here is the link by Ivm that refers to a regression equ. https://communities.sas.com/t5/SAS-Statistical-Procedures/ROC-in-SAS-obtaining-a-cut-off-value/td-p/161354

Any assistance you are able to provide would be appreciated, thanks

• Charles says:

Hi Nikki,
I am not that familiar with this topic. The following article may be helpful though:
www4.stat.ncsu.edu/~bondell/OptimCut.pdf
Charles