Hacker News

the YoloV3 (the model that these tools are designed to work with) paper is extremely funny and worth reading for anyone who hasn't.

https://pjreddie.com/media/files/papers/YOLOv3.pdf



Thank you! My favorite passage:

> YOLOv3 is a good detector. It’s fast, it’s accurate. It’s not as great on the COCO average AP between .5 and .95 IOU metric. But it’s very good on the old detection metric of .5 IOU. Why did we switch metrics anyway? The original COCO paper just has this cryptic sentence: “A full discussion of evaluation metrics will be added once the evaluation server is complete”. Russakovsky et al report that that humans have a hard time distinguishing an IOU of .3 from .5! “Training humans to visually inspect a bounding box with IOU of 0.3 and distinguish it from one with IOU 0.5 is surprisingly difficult.” [18] If humans have a hard time telling the difference, how much does it matter?

> But maybe a better question is: “What are we going to do with these detectors now that we have them?” A lot of the people doing this research are at Google and Facebook. I guess at least we know the technology is in good hands and definitely won’t be used to harvest your personal information and sell it to.... wait, you’re saying that’s exactly what it will be used for??

> Oh. Well the other people heavily funding vision research are the military and they’ve never done anything horrible like killing lots of people with new technology oh wait.....

> I have a lot of hope that most of the people using computer vision are just doing happy, good stuff with it, like counting the number of zebras in a national park [13], or tracking their cat as it wanders around their house [19]. But computer vision is already being put to questionable use and as researchers we have a responsibility to at least consider the harm our work might be doing and think of ways to mitigate it. We owe the world that much. In closing, do not @ me. (Because I finally quit Twitter).

> 1 The author is funded by the Office of Naval Research and Google.
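For readers who haven't met the metric being debated above: IOU (intersection over union) is just the overlap area of two boxes divided by the area of their combined footprint. A minimal sketch, assuming corner-format (x1, y1, x2, y2) boxes (the box convention is my choice, not the paper's):

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```

A detection counts as correct at the old metric if `iou(pred, truth) >= 0.5`; the newer COCO AP averages over thresholds from 0.5 to 0.95, which is exactly the switch the paper is grumbling about.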


For people who don't get how the last statement ties in: in the actual paper, the third quoted line has a superscript 1 on it, pointing to that footnote. The PDF doesn't allow you to highlight the 1, so the punchline isn't as strong here.


I hadn't noticed; I missed that. Thanks!


Lost it at page 4 figure 3 caption:

> You can tell YOLOv3 is good because it’s very high and far to the left. Can you cite your own paper? Guess who’s going to try, this guy→[16]

This guy cites.


I lost it at citation 1, the Wikipedia page for analogy


Incredible. I wish more people would take themselves less seriously like this. I would never have read that paper, or learned a bit about that space, if it hadn't been such an engaging read.


I love that they include a "Things We Tried That Didn’t Work" section.

At first I was a bit put off by the bloggish tone, but it doesn't obfuscate or impair the communication of information, so yes, pretty good paper!


Thank you—this is priceless. Caption to fig 4: “...and we can still screw with the variables to make ourselves look good!”

As an academic, I’ve now seen the light!


Mind blown, thanks.

What genre of papers does this belong to? I want to read more.


Quite a few ML papers are written this way. See, e.g., the Single Headed Attention RNN paper.


Is there a GitHub repo for the results cited in this paper? Either way, I have one point of confusion, despite the paper being awesomely written. We need more like this. I dislike the lifeless way new research is usually communicated; I have to force myself to read those papers, but not this one so much.

Now coming to the question: are the input size and output size the same, and is that why the box's top-left coordinates (bx, by) are predicted as the cell offset (cx) plus the prediction (sigma(tx)), and similarly for y?
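Not the asker, but as I read the paper's Figure 2, the decoding works in units of grid cells rather than input pixels, and (bx, by) is the box center, not the top-left corner. A minimal sketch of that decoding (function name and plain-Python style are mine):

```python
import math

def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Decode one raw YOLOv3 prediction into box center/size, in grid-cell units.

    (cx, cy): offset of the grid cell from the image's top-left corner;
    (pw, ph): width/height of the anchor (prior) box.
    """
    sigmoid = lambda v: 1.0 / (1.0 + math.exp(-v))
    bx = cx + sigmoid(tx)      # sigmoid keeps the center inside its cell
    by = cy + sigmoid(ty)
    bw = pw * math.exp(tw)     # width/height scale the anchor prior
    bh = ph * math.exp(th)
    return bx, by, bw, bh
```

So a zero prediction in cell (3, 4) decodes to a box centered at (3.5, 4.5) with the anchor's own size. Because everything is relative to the feature-map grid, you multiply by the stride of that detection scale to get back to input-image pixels, so the input and output resolutions need not match.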


Thanks for the chuckles.



