> 100,000 people might not be enough to just deal with flags...
You’re off by orders of magnitude.
500 * 60 * 24 * 365 = 262,800,000 hours per year / 1.3x playback speed / 2,000 hours of work per person per year = ~101,077 people to watch every uploaded video.
Less if people are watching at higher playback speeds or working significantly more than a 40h week. I would personally be surprised if even 1 in 1,000 videos were flagged by ML, at which point you’re talking ~100 people. Though that would be a truly terrible job.
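The napkin math above is easy to check; a quick sketch (the 500 hours/minute upload rate and the 1.3x average review speed are the comment's assumptions, not hard data):

```python
# Back-of-envelope check of the reviewer headcount estimate.
upload_rate = 500                       # assumed hours uploaded per minute
hours_per_year = upload_rate * 60 * 24 * 365   # 262,800,000 hours/year
playback_speed = 1.3                    # assumed average review speed
work_hours_per_year = 2_000             # ~40h/week, 50 weeks

reviewers = hours_per_year / playback_speed / work_hours_per_year
print(round(reviewers))                 # ~101,077
```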
> 500 * 60 * 24 * 365 = 262800000 hours per year / 1.3x playback speed / 2,000 hours of work per person per year = ~101,077 people to watch every uploaded video.
When you have people watch the video at that fast speed and instantly make a decision, you might as well keep the algorithm - the failure rate won't be too different.
Additionally, these people need holidays, managers, and infrastructure, and an actual flag will probably not be instantly clear: you'll need to point out the specific offence and make a case. You might even need to look things up. Then you'll have to handle appeals and discussions, because changing the AI black box into an appeal-less human black box wouldn't improve the situation much[0]. Next, you need people who understand the specific language of each video. Most will be English, but what happens when a Nigerian video is flagged? So you need people from that country, or at least familiar with it. Plus infrastructure.
Overall, you're probably looking at something approaching 400 or 500 people, even with this very low flag rate. Assuming a conservative $30k/year [1], that's $15 million USD in salaries alone. Doable, yes, but it's not the no-brainer you make it out to be.
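A sketch of the headcount and cost reasoning above; the 1-in-1,000 flag rate, the overhead multiplier, and the $30k salary are all assumptions from this thread, not measured figures:

```python
# Frontline reviewers needed if only ~1 in 1,000 videos is ML-flagged,
# then scaled up for managers, language coverage, appeals, and holidays.
flag_rate = 1 / 1000                    # assumed ML flag rate
frontline = 101_077 * flag_rate         # ~101 reviewers
overhead_multiplier = 5                 # assumed: managers, languages, appeals
total_staff = frontline * overhead_multiplier   # ~500 people
annual_cost = 500 * 30_000              # conservative $30k/year each
print(round(frontline), round(total_staff), annual_cost)
```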
Reading the article, "human error" really isn't a reasonable description of what happened:
> The Facebook furor began when the social network deleted the famous photo from Norwegian author Tom Egeland’s Facebook page, where it was part of a series of memorable wartime imagery.
This could be a human error, sure.
> When Egeland subsequently posted his shocked reaction to the removal of the “napalm girl” photo, he found his account suspended.
But not this.
> Norway’s largest newspaper, Aftenposten, published Egeland’s story on the censorship, only to find that its own Facebook posts were also quickly deleted.
Definitely not this.
> Espen Egil Hansen, Aftenposten’s editor, then took to the front page of his paper to slam Facebook in an open letter to CEO Mark Zuckerberg
> Prime Minister Solberg joined the debate on Friday, only to find that her comments and posts about the suppressed photo were also deleted by Facebook.
This is a (ridiculously severe) problem in several of Facebook's policies, not a classification error.
> Reading the article, "human error" really isn't a reasonable description of what happened:
I'm pretty sure I read that the initial blocking was done by a contractor - I can find a better source if you want.
> This is a (ridiculously severe) problem in several of Facebook's policies, not a classification error.
Yes, and that is exactly the point I tried to make:
>> changing the AI blackbox to an appeal-less human black box wouldn't improve the situation much
Said another way, having underpaid contractors with no time for consideration (and possibly strange policies) is exactly as bad. This is a pretty clear case: that photo should not have been blocked, and the idea behind having a human do the evaluation is that they recognize the historical relevance and, if not outright allowing it, at least consult the relevant authorities on whether it is exempt. An AI could have handled this all the way through without a different outcome.
People often watch YouTube videos at 2x speed; it’s part of the app to select 1.25x, 1.5x, 1.75x, or 2x, because you can generally listen to most speakers at that speed just fine. Let alone 11 hours of ocean waves, etc.
Averaging only 1.3x already assumes significant overhead: going back to re-listen to segments, an escalation queue for flagged videos, and randomly assigning multiple people to the same videos to ensure they’re paying attention rather than playing solitaire while the video runs. Plus other random crap; in other words, ~30% overhead.
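The 1.3x figure above can be read as fast playback discounted by overhead time; a sketch (both numbers are assumptions from the comment, not data):

```python
# Effective review speed: raw playback speed minus time lost to overhead.
playback = 2.0          # assumed raw playback speed
time_lost = 0.35        # assumed fraction lost to re-listening,
                        # escalation queues, duplicate assignments, etc.
effective = playback * (1 - time_lost)
print(round(effective, 2))   # ~1.3x effective review speed
```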
[disclaimer] I work at Google, all opinions are my own. I don't deal with Youtube but have had exposure to moderation operations. It is not an easy task.
You are incorrect about your assumptions in many respects. Let me list out a few:
1) There is a big difference between watching a 15 min video at 2x for fun vs watching 8 hours of videos a day while having to follow laid-down policy with real complexity. Videos are not taken down for only one reason, and there is a lot of complexity involved in edge cases. Humans are not robots, and this is not a task we are inherently good at. Give it a shot yourself for a day and see how easy it is.
2) Your overhead does not consider any additional complexity in terms of languages, specialties, regional expertise, etc. This is not a simple problem either. Sure, maybe you can hire 1,000 reviewers in, let's say, Indonesia, but you cannot find a native Kiswahili speaker there. It's not cheap to set up an office for 2 people in Kenya. Scale this to a world with 193 countries and 6,500 languages.
3) People don't work without any management overhead. You need frontline reviewers. Then you need a layer of experts above them. Then you need managers to take care of the operations. Then managers to manage those managers. Then recruiters to hire those people. Then HR to deal with their issues.
4) Turnover is a big deal. Very few people in the world can watch beheadings 40 hours a week. An even smaller proportion can handle child safety material. What happens when those people need to take time off? You need additional people. If you want a humane operation then you might have to get them to work only 2 hours a day.
It continues to amaze me how some of the smartest and technically savvy people on HN either do not recognize or refuse to admit how complicated a problem planet scale moderation is.
> It continues to amaze me how some of the smartest and technically savvy people on HN either do not recognize or refuse to admit how complicated a problem planet scale moderation is.
It's probably a variation of the trivial-core-problem [0]. When looked at roughly, it seems very easy; only when you start implementing it, you'll see all the edge cases appear. I made pretty similar points above to the ones you made - but on a quick napkin calculation, Retric's math checked out, to be fair. And when you're not in the business, missing the inherent complexity is easy.
Building systems is clearly a major effort which was specifically excluded from that estimate. However, while you’re right it’s complicated, I would point out my overall estimate was close to Google’s own.
“Since we started using machine learning to flag violent and extremist content in June, the technology has reviewed and flagged content that would have taken 180,000 people working 40 hours a week to assess.”
That said, I am surprised Google isn’t using machine translation for obscure languages. Props to them that’s going above and beyond in my personal opinion.
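For what it's worth, the napkin estimate and Google's quoted figure do land in the same order of magnitude; a quick comparison (treating Google's 180,000 as an equivalent-headcount figure at real-time playback, which is an assumption):

```python
import math

napkin = 262_800_000 / 1.3 / 2_000   # ~101,077 people at 1.3x playback
at_1x = 262_800_000 // 2_000         # 131,400 people at 1x playback
google = 180_000                     # Google's own quoted figure

# Both estimates are in the hundreds of thousands (order 10^5).
assert math.floor(math.log10(napkin)) == math.floor(math.log10(google))
print(round(napkin), at_1x, google)
```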
I'm not saying it's not complicated, but complaints about expense and overheads for a company that makes something like $40 billion a year in profit ring a little hollow. That's roughly the size of the UK's defence budget.