
I’m not a huge fan of turning the BI-RADS classification scheme into a ROC curve. From what I’ve seen, BI-RADS is something like a yes / no / maybe scheme for mammograms. I don’t think it was designed to be treated like a test score, so using it to generate a ROC curve feels like an unfair comparison between the AI system and current clinical practice.

What they’re doing is interesting, but it’s still very academic. I have little doubt that eventually some sort of AI system will benefit clinical practice, but based on the sheer number of studies that fail to make it over the line, I’m not sure I have high hopes for this one. What they’ve done so far is the equivalent of “It works in vitro...”



As I understand it, for the comparison they also converted the AI system's output to BI-RADS classes and then built a ROC curve from those.


Huh, where does it say that in the article? I don’t think I spotted that.

All the same, it feels sort of beside the point to me. It just doesn’t feel right to take a medical diagnostic tool - whose intended purpose is for communication among doctors - and treat it as a test score. That’s just... not what it was designed for.


On page 3: "Readers rated each case using the forced BI-RADS scale, and BI-RADS scores were compared to ground-truth outcomes to fit an ROC curve for each reader. The scores of the AI system were treated in the same manner (Fig. 3)."

This isn't as clear as I'd like it to be, but Fig. 3 shows both an "AI system" and an "AI system (non-parametric)" ROC curve. My understanding is that the former is fit from the discrete BI-RADS classes, and the latter comes from the "raw" output.
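To make the distinction concrete, here is a rough sketch of how the same set of scores gives a different ROC curve once it is binned into a few ordinal classes. Everything in it is an assumption for illustration: the scores and labels are made up, the bin edges are hypothetical, and it uses a plain empirical curve rather than whatever fitting procedure the paper used.

```python
from bisect import bisect_right

def roc_points(scores, labels):
    """Empirical ROC: one (FPR, TPR) point per distinct threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; "score >= t" counts as a positive call.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the piecewise-linear curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Made-up continuous scores and ground truth (1 = cancer), not from the paper.
raw = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.55, 0.60, 0.80, 0.95]
truth = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]

# Hypothetical bin edges that map a raw score onto a 4-level ordinal scale.
cutoffs = [0.25, 0.50, 0.75]
binned = [bisect_right(cutoffs, s) + 1 for s in raw]

raw_curve = roc_points(raw, truth)        # one point per distinct raw score
binned_curve = roc_points(binned, truth)  # one point per ordinal class

print(len(raw_curve), auc(raw_curve))
print(len(binned_curve), auc(binned_curve))
```

The binned curve has only as many operating points as there are classes, so everything between them is interpolation. That is the sense in which a curve built from a handful of categories and a curve built from the raw output are not quite the same object.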



