Hacker News

the YoloV3 (the model that these tools are designed to work with) paper is extremely funny and worth reading for anyone who hasn't.

https://pjreddie.com/media/files/papers/YOLOv3.pdf



Thank you! My favorite passage:

> YOLOv3 is a good detector. It’s fast, it’s accurate. It’s not as great on the COCO average AP between .5 and .95 IOU metric. But it’s very good on the old detection metric of .5 IOU. Why did we switch metrics anyway? The original COCO paper just has this cryptic sentence: “A full discussion of evaluation metrics will be added once the evaluation server is complete”. Russakovsky et al report that that humans have a hard time distinguishing an IOU of .3 from .5! “Training humans to visually inspect a bounding box with IOU of 0.3 and distinguish it from one with IOU 0.5 is surprisingly difficult.” [18] If humans have a hard time telling the difference, how much does it matter?

> But maybe a better question is: “What are we going to do with these detectors now that we have them?” A lot of the people doing this research are at Google and Facebook. I guess at least we know the technology is in good hands and definitely won’t be used to harvest your personal information and sell it to.... wait, you’re saying that’s exactly what it will be used for??

> Oh. Well the other people heavily funding vision research are the military and they’ve never done anything horrible like killing lots of people with new technology oh wait.....

> I have a lot of hope that most of the people using computer vision are just doing happy, good stuff with it, like counting the number of zebras in a national park [13], or tracking their cat as it wanders around their house [19]. But computer vision is already being put to questionable use and as researchers we have a responsibility to at least consider the harm our work might be doing and think of ways to mitigate it. We owe the world that much. In closing, do not @ me. (Because I finally quit Twitter).

> 1 The author is funded by the Office of Naval Research and Google.
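For readers who haven't met the metric being debated above: IOU (intersection over union) is just the overlap area of two boxes divided by the area of their combined footprint. A minimal sketch, assuming corner-format (x1, y1, x2, y2) boxes (the box convention is my choice, not the paper's):

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```

A detection counts as correct at the old metric if `iou(pred, truth) >= 0.5`; the newer COCO AP averages over thresholds from 0.5 to 0.95, which is exactly the switch the paper is grumbling about.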


For people who don't get how the last statement ties in: in the actual paper, the third quoted line has a superscript 1 on it, pointing to that footnote. The PDF doesn't allow you to highlight the 1, so the punchline isn't as strong here.


I hadn't noticed; I missed that. Thanks!


Lost it at page 4 figure 3 caption:

> You can tell YOLOv3 is good because it’s very high and far to the left. Can you cite your own paper? Guess who’s going to try, this guy→[16]

This guy cites.


I lost it at citation 1, the Wikipedia page for analogy


Incredible. I wish more people would take themselves less seriously like this. I would never have read that paper, or learned a bit about that space, if it hadn't been such an engaging read.


I love that they include a "Things We Tried That Didn’t Work" section.

At first I was a bit put off by the bloggish tone, but it doesn't obfuscate or impair the communication of information, so yes, pretty good paper!


Thank you—this is priceless. Caption to fig 4: “...and we can still screw with the variables to make ourselves look good!”

As an academic, I’ve now seen the light!


Mind blown, thanks.

What genre of papers does this belong to? I want to read more.


Quite a few ML papers are written this way. See, e.g., the Single Headed Attention RNN paper.


Is there a GitHub repo for the results cited in this paper? Either way, I have one point of confusion, despite the paper being awesomely written. We need more like this. I dislike the lifeless way new research is usually communicated; I have to force myself to read those papers, but not this one so much.

Now coming to the question: are the input size and output size the same, and is that why the box's top-left coordinates (bx, by) are predicted as the cell offset (cx) plus the prediction (sigma(tx)), and similarly for y?
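Not the asker, but as I read the paper's Figure 2, the decoding works in units of grid cells rather than input pixels, and (bx, by) is the box center, not the top-left corner. A minimal sketch of that decoding (function name and plain-Python style are mine):

```python
import math

def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Decode one raw YOLOv3 prediction into box center/size, in grid-cell units.

    (cx, cy): offset of the grid cell from the image's top-left corner;
    (pw, ph): width/height of the anchor (prior) box.
    """
    sigmoid = lambda v: 1.0 / (1.0 + math.exp(-v))
    bx = cx + sigmoid(tx)      # sigmoid keeps the center inside its cell
    by = cy + sigmoid(ty)
    bw = pw * math.exp(tw)     # width/height scale the anchor prior
    bh = ph * math.exp(th)
    return bx, by, bw, bh
```

So a zero prediction in cell (3, 4) decodes to a box centered at (3.5, 4.5) with the anchor's own size. Because everything is relative to the feature-map grid, you multiply by the stride of that detection scale to get back to input-image pixels, so the input and output resolutions need not match.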


Thanks for the chuckles.



