Hacker News | ccostes's comments

I think calling this a "wild-ass guess" undersells it a bit (either that or we have very different definitions of a WAG). A very well thought-through and compelling case.

My biggest question is whether composable models are indeed the general case, which you say they confirmed as evidenced by the shift away from non-profit. It's certainly true for some domains, but I wonder if it's universal enough to enable the ecosystem you describe.


So a Christmas card list would be illegal? That seems...excessive.


I think they know it's bad, they just don't need to make it better because it's their OS so they're the only game in town. Developer experience isn't why people write apps for mac/iOS, so there's not much incentive for them to invest in it.


Why is that?


He can buy the local Golden Corral any time he wants.


That's awesome! Did you fly helicopters before starting on the instructor path?

Helicopters seem like a blast, but the cost is on another level (and that's speaking as a fixed-wing pilot used to the crazy costs of GA generally).


No, I had no proper flying experience. I used GI Bill benefits to cover the cost.


Aside from the Rust aspect (which is cool!), I can't believe we've come this far and still don't have low-latency video conferencing. Maybe I'm overly sensitive, but people talking over each other and the lack of conversational flow drives me crazy with things like hangouts.


John Carmack always has an interesting point to make about latency: https://twitter.com/ID_AA_Carmack/status/193480622533120001

>I can send an IP packet to Europe faster than I can send a pixel to the screen. How f’d up is that?

and to relate to the other post about landlines: https://twitter.com/ID_AA_Carmack/status/992778768417722368

>I made a long internal post yesterday about audio latency, and it included “Many people reading this are too young to remember analog local phone calls, and how the lag from cell phones changed conversations.”


> Many people reading this are too young to remember analog local phone calls, and how the lag from cell phones changed conversations

Is there somewhere to read about the changes in question?

I'm old enough to remember extensive use of analog landlines, and can't really think of any difference to a cellphone other than audio quality.


In my world, using regular cell service (not VoLTE) seems nearly as instantaneous as I remember analog lines being. I remember how hard a satellite phone call was, and I never have that much latency in a call.


Isn't this mostly because actually showing a pixel requires a macroscopic change?


Cisco "telepresence" solved this 15 years ago. Standardized rooms on both sides with high quality cameras and low latencies. Polycom had a similar but worse setup at the time. The Cisco experience was very close to being in a shared meeting with the other people. It made meetings across continents work very well and was an actual competitor to flying everywhere. Between the hardware being too expensive and the link requirements being very high I only ever saw it implemented in multinational telecoms for whom it was an actual work tool but also something to impress their clients with.

Either Cisco needed to bring down the cost massively to expand access or someone needed to build it in major cities and bill by the hour to compete against flying. None of those happened so it stayed a niche. Compared to those experiences more than a decade ago the common VC is still very slowly catching up. Part of it is setup, like installing VC rooms with 2 smaller TVs side by side instead of one large one so you can see the document and the other people at decent sizes. But part of it is still the technology. Those "telepresences" were almost surely on a dedicated link running on the telecom core network that guaranteed quality instead of routing through the internet and randomly failing. I suspect getting really low latency will require that kind of telecom level QoS otherwise you'll be increasing buffer sizes to avoid freezes.


Cisco and HP Halo were incredible but the biggest problem they had was 1) the requirement to build out an actual room for it and 2) the shitty software setup experience. The big corporates that could afford to build out real estate for VCs also bogged the shit down in "enterpriseyness" that made the shit impossible to use.


About 10 years ago I got to go on a tour of the Taiwan HP office. One thing that stands out in my mind was the telepresence rooms. Absolutely fabulous: a large table, with screens across the table that showed a high-fidelity, low-latency image of whoever was sitting at a connected table.


Latency was still a huge issue with the HP Halo. I remember a specific meeting where they talked about upgrading the audio codec, which didn't seem to address things much. It was kind of a running joke that any applause or laughter would land with a huge, noticeable lag between locations.


I worked at a company that had a Cisco telepresence machine on wheels. You had to make sure it was plugged into a certain color Ethernet wall jack for it to work but every room had one. You could reserve it and then wheel it to the conference room you wanted.


That's nothing like a Cisco telepresence room. You have to have used one to understand. It's nothing too sci-fi -- not floor to ceiling curved displays or whatnot -- but just the multiple large TVs all in a curved setup on the other side of a curved table makes a huge difference.


And a standardized wall color and camera location, so that everyone that joins in from another telepresence room blends in as if they were really there.



It would seem like they relaxed the rules about what's in the background. But then, my knowledge is from a Telepresence room having been set up at a previous employer somewhere between 10 and 15 years ago (and I wasn't directly involved).


It would be interesting if a camera was on top of every tv, so that you have a 1-to-1 with every recipient.

That way, when you turn your head to the person on each tv, it would seem as if you were actually looking at them.


Getting off topic here, but this makes me think of what you can see now in some Japanese programs because of social distancing measures. I don't know what kind of setup they have, but in some programs, from the spectator's perspective, you see people lined up behind a table, and some of them are actually on large monitors that make them appear at the right size. The interesting thing is that the ones on monitors act as if they were actually there, turning their heads in the direction of the person speaking.


What Japanese programs?


ひるおび is one of them IIRC.


My first job out of school was doing product verification for the cameras that were used in those Cisco systems! It was pretty impressive; I think they managed to squeeze 1080p at 60fps over USB2. Had a lot of fun building jigs and testing setups to test the MTBF on a tight time frame.


The biggest problem is that of the video codecs which ultimately boils down to using interframe compression. This technique requires that a certain # of video frames be received and buffered before a final image can be produced. This requirement imposes a baseline amount of latency that can never be overcome by any means. It is a hard trade-off in information theory.

Something to consider is that there are alternative techniques to interframe compression. Intraframe compression (e.g. JPEG) can bring your encoding latency per frame down to 0~10ms at the cost of a dramatic increase in bandwidth. Other benefits include the ability to instantly draw any frame the moment you receive it, because every single JPEG contains 100% of the data. With almost all video codecs, you must have some prior # of frames in many cases to reconstitute a complete frame.

For certain applications on modern networks, intraframe compression may not be as unbearable an idea as it once was. I've thrown together a prototype using LibJpegTurbo and I am able to get a C#/AspNetCore websocket to push a framebuffer drawn in safe C# to my browser window in ~5-10 milliseconds @ 1080p. Testing this approach at 60fps redraw with event feedback has proven that ideal localhost roundtrip latency is nearly indistinguishable from native desktop applications.

The ultimate point here is that you can build something that runs with better latency than any streaming offering on earth right now - if you are willing to make sacrifices on bandwidth efficiency. My 3 weekend project arguably already runs much better than Google Stadia regarding both latency and quality, but the market for streaming game & video conference services which require 50~100 Mbps (depending on resolution & refresh rate) constant throughput is probably very limited for now. That said, it is also not entirely non-existent - think about corporate networks, e-sports events, very serious PC gamers on LAN, etc. Keep in mind that it is virtually impossible to cheat at video games delivered through these types of streaming platforms. I would very much like to keep the streaming gaming dream alive, even if it can't be fully realized until 10gbps+ LAN/internet is default everywhere.
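The intraframe-vs-interframe trade-off described above can be illustrated with a toy sketch (using zlib as a stand-in codec, not the commenter's actual LibJpegTurbo pipeline): each intraframe-compressed frame is decodable the moment it arrives, while a delta-coded frame is smaller but useless without its predecessor.

```python
import zlib

# Two synthetic "frames": frame2 differs from frame1 only slightly.
frame1 = bytes([i % 256 for i in range(10_000)])
frame2 = bytearray(frame1)
frame2[500:510] = b"\xff" * 10
frame2 = bytes(frame2)

# Intraframe: each frame compressed on its own -- decodable as soon as it arrives.
intra = [zlib.compress(f) for f in (frame1, frame2)]

# Interframe: frame2 stored as a delta (XOR) against frame1 -- smaller,
# but undecodable without frame1.
delta = bytes(a ^ b for a, b in zip(frame1, frame2))
inter = [zlib.compress(frame1), zlib.compress(delta)]

print(sum(len(c) for c in intra), ">", sum(len(c) for c in inter))

# Decoding: the intraframe copy of frame2 needs nothing else...
assert zlib.decompress(intra[1]) == frame2
# ...while the interframe copy requires the previous frame to reconstruct.
prev = zlib.decompress(inter[0])
rec = bytes(a ^ b for a, b in zip(prev, zlib.decompress(inter[1])))
assert rec == frame2
```

The bandwidth gap here is exactly the efficiency a real interframe codec buys at the cost of inter-frame dependencies (and hence buffering and loss sensitivity).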


Interframes are not a problem, as long as they only reference previous frames, not future ones.

I was able to get latency down to 50ms, streaming to a browser using MPEG1[1]. The latency is mostly the result of 1 frame (16ms) delay for a screen capture on the sender + 2-3 frames of latency to get through the OS stack to the screen at the receiving end. En- and decoding was about ~5ms. Plus of course the network latency, but I only tested this on a local wifi, so it didn't add much.

[1] https://phoboslab.org/log/2015/07/play-gta-v-in-your-browser...


It's funny you mention MPEG1. That's where my journey with all of this began. For MPEG1 testing I was just piping my raw bitmap data to FFMPEG and piping the result to the client browser.

I was never satisfied with the lower latency bound for that approach and felt like I had to keep pushing into latency territory that was lower than my frame time.

That said, MPEG1 was probably the simplest way to get nearly-ideal latency conditions for an interframe approach.


Wouldn't you then hit issues where a single dropped packet can cause noticeable problems? In an intraframe solution, if you lose (part of) a frame, you just skip it and use the next one instead. But if you need that frame in order to render the next one, you either have to lag or display a corrupted image until your next keyframe.

I guess as long as keyframes are common and packet loss is low it'd work well enough.


Corrupted frames happen; they're not too bad. You can also use erasure coding.


Interesting. I guess I'll have to rewrite a lot of code if what you are saying is true.


You can also just configure your video encoder to not use B-frames. Then if you make all consecutive frames P frames then the size is very maintainable. It gets trickier if your transport is lossy since a dropped P frame is a problem but it's not an unsolvable problem if you use LTR frames intelligently.

All the benefits of efficient codecs, more manageable handling of the latency downsides.

The challenges you'll run into instantly with JPEG is that the file size increase & encoding/decoding time on large resolutions outstrips any benefits you get in your limited tests. For video game applications you have to figure out how you're going to pipeline your streaming more efficiently than transferring a small 10 kb image as otherwise you're transferring each full uncompressed frame to the CPU which is expensive. Doing JPEG compression on the GPU is probably tricky. Finally decode is the other side of the problem. HW video decoders are embarrassingly fast & super common. Your JPEG decode is going to be significantly slower.

* EDIT: For your weekend project are you testing it with cloud servers or locally? I would be surprised if under equivalent network conditions you're outperforming Stadia so careful that you're not benchmarking local network performance against Stadia's production on public networks perf.


I tested: localhost (no network packets on copper), within my home network (to the router and back), and across a very small WAN distance in the metro-local area (~75 Mbps link speed w/ 5-10ms latency).

The only case that started to suck was the metro-local, and even then it was indistinguishable from the other cases until resolution or framerate were increased to the point of saturating the link.

One technique I did come up with to combat the exact concern raised above regarding encoding time relative to resolution is to subdivide the task into multiple tiles which are independently encoded in parallel across however many cores are available. When using this approach, it is possible to create the illusion that you are updating a full 1080/4k+ scene within the same time frame that a tile (e.g. 256x256) would take to encode+send+decode. This approach is something that I have started to seriously investigate for purposes of building universal 2d business applications, as in these types of use cases you only have to transmit the tiles which are impacted by UI events and at no particular frame rate.
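The tile-subdivision technique described above can be sketched roughly as follows (a hypothetical illustration: the 256-pixel tile size matches the comment, but the framebuffer layout and the use of zlib in place of JPEG are assumptions):

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

TILE = 256          # tile edge in pixels (matches the comment's example)
WIDTH, HEIGHT = 1024, 512
BPP = 3             # bytes per pixel (assumed RGB)

def encode_tile(framebuffer, tx, ty):
    """Slice one TILE x TILE region out of the framebuffer and compress it
    independently, so tiles can be encoded (and sent) in parallel."""
    rows = []
    for y in range(ty * TILE, (ty + 1) * TILE):
        start = (y * WIDTH + tx * TILE) * BPP
        rows.append(framebuffer[start:start + TILE * BPP])
    return (tx, ty), zlib.compress(b"".join(rows))

framebuffer = bytes(WIDTH * HEIGHT * BPP)  # stand-in for a rendered frame

tiles = [(tx, ty) for ty in range(HEIGHT // TILE) for tx in range(WIDTH // TILE)]
with ThreadPoolExecutor() as pool:
    encoded = dict(pool.map(lambda t: encode_tile(framebuffer, *t), tiles))

print(len(encoded), "tiles encoded")  # 4 x 2 grid -> 8 tiles
```

In the UI-streaming use case the comment describes, only the tiles touched by a UI event would be re-encoded and transmitted; the rest of the grid stays untouched.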


Actually, there are commercial CUDA JPEG codecs (both directions) operating at gigapixels per second. It's not a question of speed, but rather the fact that you can at least afford to use H.264's I-frame-only codec for much lower bandwidth requirements.


JPEG is still going to be larger & lower quality than H264. I still fail to see the advantage.


~10x higher framerate?


Almost every hardware codec I've seen supports JPEG. MJPEG is certainly more rare than the more traditional video algorithms, but it certainly gets used.


You can also eliminate I-frames and have I-slices distributed among several P-frames, so that you don't have spikes in bandwidth (and possibly latency, if the encoder needs more time to process an I-frame).


I think a larger issue is the focus on video as opposed to audio. Audio may be less sexy but it is far and away more important for most interpersonal communication (I'm not discussing gaming or streaming or whatever, but teleconferencing). Most of us don't care that much if we get super crisp, uninterrupted views of our colleagues or clients, but audio problems really impede discussion.


Video is related to this though. If audio is synced to the video then a delayed video stream also means a delayed audio stream.


In my approach, these would be 2 completely independent streams. I haven't implemented audio yet, but hypothetically you can continuously adjust the sample buffer size of the audio stream to be within some safety margin of detected peak latency, and things should self-synchronize pretty well.

In terms of encoding the audio, I don't know that I would. For video, going from MPEG->JPEG brought the perfect trade-off. For reducing audio latency, I think you would just need to send raw PCM samples as soon as you generate them. Maybe in really small batches (in case you have a client super-close to the server and you want virtually zero latency). If you use small batches of samples you could probably start thinking about MP3, but raw 44.1 kHz 16-bit stereo audio is only ~1.41 Mbps. Most cellphones wouldn't have a problem with that these days.

Edit: The fundamental difference in information theory regarding video and audio is the dimensionality. JPEG makes sense for video, because the smallest useful unit of presentation is the individual video frame. For audio, the smallest useful unit of presentation is the PCM sample, but the hazard is that these are fed in at a substantially higher rate (44k/s) than with video (60/s), so you need to buffer out enough samples to cover the latency rift.
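The arithmetic behind the raw-PCM claim, plus the batching trade-off, works out as follows (a quick back-of-the-envelope check, not anything from the original post):

```python
SAMPLE_RATE = 44_100   # Hz
BITS = 16
CHANNELS = 2

# Raw PCM bitrate: 44.1 kHz * 16 bit * 2 channels ~= 1.41 Mbps.
bitrate_bps = SAMPLE_RATE * BITS * CHANNELS
print(bitrate_bps / 1e6)  # -> 1.4112

# Sending samples in small batches trades packet overhead for latency:
# a batch of N samples adds N / SAMPLE_RATE seconds of buffering delay.
for batch in (64, 256, 1024):
    print(batch, "samples ->", round(batch / SAMPLE_RATE * 1000, 2), "ms")
```

Even a generous 1024-sample batch adds only ~23ms of buffering, which is well under typical Bluetooth codec latencies discussed elsewhere in this thread.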


Discord does something like what you describe. It's kind of awful for music(e.g. if it's a channel with a music bot) as you'll hear it speed up and slow down in an oscillating pattern. The same effect also appears in games if you should have a game loop that always tries to catch up to an ideal framerate by issuing more updates to match an average - the resulting oscillation as the game suddenly slows down and then jerks forward is hugely disruptive, so it's not really done this way in practice.

Oscillations are the main issue with "catch-ups" in synchronization, and dropping frames once your buffer is too far behind is often a more pleasant artifact. It's not really a one-size-fits-all engineering problem.


Audio conferencing at low latency is already solved by things like Mumble (https://www.mumble.info/). I think adding a video feed in complete parallel (as in, use mumble as-is, do the video in another process) with no regard for latency would be a pretty good first step to see what can be achieved.


Early versions of Youtube nailed this. The video would frequently pause, degrade, or glitch due to buffering delays but the audio would continue to play. This made all the difference in user perception: youtube felt smooth. Other streaming services would pause both video and audio which did not feel smooth at all. Maybe they had some QoS code in their webapp to prioritize audio?


One technique that could be used (to get high compression rates while still compressing each frame independently) is to train a compression "dictionary" on the first few seconds/minutes of a data stream, and then use the dictionary to compress/decompress each frame.
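Python's zlib happens to support exactly this idea via preset dictionaries, so it can be sketched directly (the "training" here is simply reusing early stream bytes as the dictionary; a real system might use a proper dictionary trainer such as zstd's instead):

```python
import zlib

# "Train" a dictionary from early frames: zlib lets you seed the compressor
# with a preset dictionary (zdict) of bytes likely to recur in later data.
header = b"header:frame;pixels=rgb;width=640;height=480;"
early_frames = header * 40
zdict = early_frames[-32_768:]          # zlib dictionaries max out at 32 KB

# A later frame sharing structure with the training data (hypothetical layout).
frame = header + b"payload" + bytes(64)

def compress(data, dictionary=None):
    c = zlib.compressobj(zdict=dictionary) if dictionary else zlib.compressobj()
    return c.compress(data) + c.flush()

plain = compress(frame)
primed = compress(frame, zdict)
print(len(primed), "<", len(plain))     # dictionary-primed output is smaller

# The decompressor must be seeded with the same dictionary.
d = zlib.decompressobj(zdict=zdict)
assert d.decompress(primed) + d.flush() == frame
```

The catch is that both ends must hold the identical dictionary, so it has to be transmitted (or derived deterministically) before the low-latency phase begins.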


Well, all the effort is regularly defeated by poor hardware - you can have 40ms latency in the video call stack, but when people attach Bluetooth headphones which buffer everything for 300ms there's nothing really to be done.

(Be gentle on your coworkers and use cabled headphones.)


LLAC/LHDC LL bluetooth codec adds only 30ms.

AptX low latency codec adds only 40ms max.

Just buy headphones with good low latency support. They aren't even expensive anymore.


Bluetooth audio is a mess of compromise. The default sbc codec is basically fine for low latency but the parameters are all pretty terrible. Everyone uses the same few default parameters which neither give particularly high quality (especially for two way audio which was designed to be compatible with phone quality), nor low latency (especially for the high quality a2dp profile). One issue is that the designs/defaults haven’t really been updated since about 2000, and the parameters are very hard to change, typically the OS’s preference is hardcoded somewhere (also, whichever device initiates the connection gets to choose the parameters so even if you configured your computer to choose “better” parameters, it would all be for naught if you let the headphones connect to the computer rather than the other way round). The other issue is that Bluetooth is quite severely bandwidth constrained and higher bandwidth could theoretically give lower latency.


>LLAC/LHDC LL bluetooth codec adds only 30ms.

"only" is positive thinking.

I do play some rhythm games (LLSIF, deresute, mirishita) on Android. The difference between "only adds 30ms" and plugging my headphones directly to the headphone jack is the difference between unplayable and playable. The games do have a latency compensation setting (with a calibration procedure), but compensation is no substitute for the real thing: Low latency.


LLAC/AptX LL isn't well adopted on host devices yet, especially on Apple devices.

And even a 30ms delay, applied to both the headphones and the mic for both talkers, means 30 × 2 × 2 = 120ms of delay.


Okay, but I want to wear wireless headphones.

Why can't I have both? Wifi doesn't seem to have this latency problem.


How do you know? :)

The latency doesn't come from bluetooth radio part itself (there ARE low latency BT headphones after all).

It comes from the fact that all audio is encoded (usually into SBC or AAC or AptX), transmitted and then decoded in the headphones. And each of those steps has buffers. And those buffers are configured by the manufacturer.

The bigger the buffer, the more stable the audio connection - there's less stuttering, less dropouts. But every buffer in the chain adds latency.

So why can't you have both? You sure can. You just need to somehow find headphones and a PC that doesn't add latency to bluetooth. Sadly that's not something that's usually documented in technical specs.


Or use wireless mics that don't use bluetooth and are dedicated to low latency wireless audio. Like the ones they use for theatre: https://www.adorama.com/alc/how-to-choose-a-wireless-microph...


Wifi's latency has a high dispersion. I've seen absolutely terrible wifi latency, and latency that is under 1ms. wifi degrades gracefully, which makes it really tough to work with.

But pretty much all serious gamers use an ethernet connection because wifi is a pain in the ass. In fact, the first thing a support representative for any game will tell you when complaining about excessive lag is to try a wired connection.


WiFi has terrible latency. Try playing a multiplayer FPS with wired networking and compare with WiFi. Or simply use remote desktop with WiFi.


Whatever wifi you're using is probably overloaded. You can easily have a one millisecond ping to your access point.


I have an under 2 ms ping to my AP, but WiFi has terrible buffer bloat, so ping latency doesn't mean much actually.


Any idea what the state of the art is for reducing buffer bloat on access points?

And you can mitigate that by not using tons of bandwidth in the background while gaming.


Terrible latency _and_ packet drop.

I only use wifi where I cannot attach a cable. I will run 15m ethernet cable on an apartment's floor if I have to, in order not to have to use wifi.


I believe RF based wireless headphones (like my Arctis 7 headphones) don't have this latency in them due to not being Bluetooth based.

There is some patented codec I think that does allow low latency bluetooth streaming (forgot the name) but that's not heavily implemented in my experience.


Old-school BT headsets are low-latency enough, afaik. But yeah, just blasting the Opus directly from the network to the headphones would solve it, even re-coding in low-latency configuration only adds 5ms.


You probably mean AptX Low-Latency. I haven't seen it a lot and it's basically just AptX with tweaked buffer sizes.


> Wifi doesn't seem to have this latency problem.

Wifi is one of the best things you can do to add unreliability and latency.


There are hard limits at play. No matter what you do, you can't go from New York to London in less than ~20ms; add video/audio encoding, packet switching, decoding, etc. and it's easy to see why any latency under the 100ms mark at that spatial scale in a scalable, mainstream product would be close to a miracle.

The thing is that when we talk in a room, sound will take <10ms to reach my ears from your mouth. This is what "enables" all of the human turn taking cues in conversation (eye contact, picking up whether a sentence is about to end/whether it's a good time to chime in/etc) - I've been looking for work from people who've tried to see at what point things start feeling really bad (is it 10ms, or 50ms?), but haven't found much so far. No matter what it is though, it's likely that long distance digital communications just cannot match it.

See also this interesting comment about the feeling of "closeness" from phone copper wires:

https://news.ycombinator.com/item?id=22931809

Landlines were so fast and so "direct" in their latency (where distance correlates very directly with time, due to a lack of "hops") that local phone calls were faster than the speed of sound across a table, and for a bit after they came out--before people generally got used to seemingly random latency--local calls felt "intimate", like as if you were talking to someone in bed with their head right next to you; I also have heard stories of negotiators who had gotten really tuned to analyzing people's wait times while thinking that long distance calls were confusing and threw them off their game.


> it's easy to see why any latency under the 100ms mark at that spatial scale in a scalable, mainstream product would be close to a miracle.

It seems normal phones are able to do it, though. At least it seems normal phones suffer less latency problem.

In a way, simplicity in technology often means better performance.


Linux is ill-suited for realtime applications.

Google is well-aware of this, thus Fuchsia.

SeL4 would make a good base for such a device.


The media lab has done a ton of research on this. I seem to remember people being able to notice visual latency at 30ms and audio latency at 80-120ms (this is because light is faster than sound).


>and audio latency at 80-120ms

Any rhythm game player will disagree.

Some games (e.g. llsif, for android) have "perfect" window sized to 16ms (a video frame). Even with latency compensation, these are unplayable on bluetooth yet fine on headphone jack. As the game has calibration, the resulting offset is seen to be at least 30ms worse on bluetooth.


Interesting, would love to read more if specific papers/authors come to your mind. I suspect there's a big gap between e.g. "noticing the audio latency when audio is played as a result of pressing a button" vs "audio latency affecting the flow of a multiparty conversation".



it's probably the latter, because the former is about 5ms (which is equivalent to the statement, "how short of a time between sounds are they perceivable as separate" aka the lower frequency threshold of hearing). It's non obvious that they're the same limit.


> The thing is that when we talk in a room, sound will take <10ms to reach my ears from your mouth. This is what "enables" all of the human turn taking cues in conversation (eye contact, picking up whether a sentence is about to end/whether it's a good time to chime in/etc) - I've been looking for work from people who've tried to see at what point things start feeling really bad (is it 10ms, or 50ms?), but haven't found much so far. No matter what it is though, it's likely that long distance digital communications just cannot match it.

Digital communication could cheat, though!

There's a lot of latency hiding you can do, if you can predict well enough what's coming next. Humans are fairly predictable most of the time.


Where does Tonari actually put the camera? The perspective on the displayed image makes it look like the camera is ceiling mounted, but that would make the eye contact problem much worse than even Zoom.


If I had to guess at a possible future, I can imagine edge computing servers that connect over 5G or fiber to your device. On these edge computing servers, they predict using AI/ML what you, as a participant, could do in the next 50-60ms or longer (video, including facial and hand gestures; audio, including Toastmasters-type fillers like "ahh" and "umm") and transmit their guess as rendered video frames and audio in time for the other videoconferencing participants to see "no latency" interaction. Done right, it would seem real. Done wrong, definite Max Headroom feel.


I'm wondering where the raw video from the helicopter came from originally. My guess is that this isn't something you can do with an SDR and hobbyist equipment, but I would love to be proven wrong.


https://www.youtube.com/watch?v=2MprHxarmOI

The footage is direct from the news organisation. I'm guessing this is footage from their FLIR mounted camera, maybe captured streaming to their HQ. Is that what you mean?


Agreed, the "if anything management is getting screwed here" got me good. Still not convinced management, who walked away with millions in cash, got screwed, but it was a well-crafted article.

Edit: I should clarify that by "walking away with millions" I was referring to their normal cash comp. 2019 CEO cash comp was $3.6MM, and she got $700k as a retention bonus.


Do companies really have any other choice but to hand out a bunch of cash to senior management? Nobody really wants to hang around and go down with the ship, but someone has to stick around to manage the reorganization. These weren't bonuses to "walk away" with cash, these were bonuses for them to stick around.


Where else were they going to go right now? The entire auto industry, and especially the rental side, are effectively stalled and not hiring new people. In a normally functioning market I would agree with you, but in this case it really comes across as a ridiculous and needless handout.


Maybe the market downturn can account for why these bonuses are less than we normally see. But these are senior execs we're talking about; it is not uncommon for them to switch industries. Their job specialty is managing a company, not renting cars. The current Hertz CEO has only been in the car rental industry for two years.


I'll repeat the question then - where are they going to go? Unless you think all of them could transition to SaaS companies I'm not seeing how it's relevant that they're senior execs.


Their current CEO spent 28 years at Walmart, and Walmart (as well as several other retailers) has had great financial performance recently.

Or many of them could probably afford to sit on their butt at their beach home for a few years and ride out the recession.

I don't see any reason why they couldn't work at a SaaS company though. It's relevant that they're senior execs because their job position is so far removed from the actual product that they really only need a general understanding of it.


I'm going to guess you've never seen how senior executive recruiting actually works. I regularly work with senior-level executives in automotive, tier 1 supplier, and software companies and can tell you for a fact that right now these people would not easily find jobs anywhere else. Without the retention payouts their options would have been 1) keep working through the wind-down and receive their normal level of pay and benefits or 2) have no job and get the 55k/yr equivalent in unemployment with no benefits. I 100% do not believe very many, if any, would have opted for the latter option and sitting around on the beach. This payout was unnecessary.


> 2) have no job and get the 55k/yr equivalent in unemployment with no benefits.

... plus income from directorships, investments, etc.

People with high net worth don't need to rely on consistent income the way the rest of us do. A lot of them are later on in their careers and could potentially retire as well.

There are other options too, like quit and become a consultant, which is exactly what Marinello did.

It is very common for senior management to run to the door during a bankruptcy, even during an economic downturn. If there's any argument that this payout is unnecessary, it is that senior management doesn't really care about the money as much as it is about the situation.


$16 million over 340 employees (if you believe the article) is $47,058.82 per. Not exactly a golden parachute if you ask me.


Assuming an even distribution of funds among those 340 people is a little hard to believe as well.


> walking away with millions

I was referring to their normal compensation, should have made that clear.

According to the article the CEO got $700k from the $16MM, and her 2019 cash comp was $3.6MM.


Totally forgot that Boeing owned them. They have been on fire lately with all the new features and I hope this doesn't impact them.

