houshuang · on Jan 27, 2022

I've gotten jobs at three international remote companies, all paying above $150k, with no whiteboard and definitely no LeetCode. Domain expertise helps, though, as does working in the open (a track record of open source code, blog posts, interactions, communication style).

At my current company, we're open to hiring in Europe, and we do a ~4 hr take-home assignment, which is then discussed in a meeting. Evaluation is just as much about documentation and showing your thinking as it is about code quality - very important qualities for a remote team.

houshuang · on May 15, 2020

Interesting approach. I learnt Chinese in university 20 years ago, and spent a long time in China both with traditional textbooks (self-study) and just living there... Very curious about the approach of learning characters without the pronunciation - would love to hear how it goes in the future when the author moves on to learning words, grammar, etc.

This made me think about my own process of learning and improving Chinese, so I wrote up a bunch of rough notes here. I might try to experiment with a few things going forward. https://notes.reganmian.net/a--chinese

houshuang · on May 7, 2019

One neat thing about live collaboration is that every single edit is stored. This enables some nice things, like seamless replay of the entire document editing history. We've been working for a while on analyzing edit histories, to see if we could predict which stage of editing a document is in, which writing strategy users use, how a small team is collaborating, etc. (The main purpose is to support teachers using this with student groups.) We have some code here that works with Etherpad and ShareDB (https://github.com/chili-epfl/FROG-analytics), and I'm happy to share preprints with anyone interested.
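
To make the replay idea concrete, here is a minimal sketch. The `Edit` record and the `replay` helper are hypothetical and much simpler than real Etherpad changesets or ShareDB ops; the only point is that a stored edit log lets you rebuild every intermediate state of a document.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Edit:
    """Hypothetical stored edit: remove `removed` chars at `pos`, then insert `text` there."""
    pos: int
    removed: int = 0
    text: str = ""


def apply_edit(doc: str, edit: Edit) -> str:
    # Splice the replacement text into the document at the edit position.
    return doc[:edit.pos] + edit.text + doc[edit.pos + edit.removed:]


def replay(history: List[Edit], upto: Optional[int] = None) -> List[str]:
    """Return every intermediate state, starting from the empty document."""
    states = [""]
    for edit in history[:upto]:
        states.append(apply_edit(states[-1], edit))
    return states


history = [
    Edit(pos=0, text="Hello"),
    Edit(pos=5, text=" wrld"),
    Edit(pos=6, removed=4, text="world"),  # fix the typo
]
states = replay(history)
for revision, state in enumerate(states):
    print(revision, repr(state))
```

Features for analysis (pauses, revision bursts, who edited what) could then be computed over `states` or over the edit log itself.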

rodneyt · on May 8, 2019

Hi Houshuang - very interested in this area. My company produces a web app for document-based collaboration in Higher Education (working with Universities in Australia and NZ over the last 10 years). We are also research-active and involved in collaborative research projects with a number of Unis. Please contact me via my profile. ~ Rodney

nateps · on May 7, 2019

Hey houshuang! Thanks for mentioning this. I do think that replay and the potential to invert operations (including some operations and not others) is a very interesting feature of OT, and we use it at Lever quite often. It is incredibly useful when doing enterprise customer support, in addition to being something that you can build user-facing features around.
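
A rough sketch of what inverting operations means, using a made-up `(kind, pos, text)` tuple format rather than any real OT type. The assumption that makes it work is that a delete records the text it removed, so it can be turned back into an insert.

```python
def apply_op(doc, op):
    """Apply a hypothetical (kind, pos, text) operation to a string."""
    kind, pos, text = op
    if kind == "insert":
        return doc[:pos] + text + doc[pos:]
    if kind == "delete":
        if doc[pos:pos + len(text)] != text:
            raise ValueError("delete does not match document content")
        return doc[:pos] + doc[pos + len(text):]
    raise ValueError(f"unknown op kind: {kind}")


def invert(op):
    """An insert is undone by deleting the same text, and vice versa."""
    kind, pos, text = op
    return ("delete" if kind == "insert" else "insert", pos, text)


def undo_all(doc, ops):
    """Undo a whole history by applying the inverses in reverse order."""
    for op in reversed(ops):
        doc = apply_op(doc, invert(op))
    return doc


ops = [("insert", 0, "abc"), ("delete", 1, "b"), ("insert", 2, "!")]
doc = ""
for op in ops:
    doc = apply_op(doc, op)
print(repr(doc))                 # document after all three ops
print(repr(undo_all(doc, ops)))  # back to the starting document
```

Undoing only some operations is harder than this: the inverse has to be transformed against every later operation that is kept, which this sketch does not attempt.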

houshuang · on May 7, 2019

I am curious about this argument - I think CKEditor 5 made a similar one in this epic blog post about how they implemented real-time collaboration (https://ckeditor.com/blog/Lessons-learned-from-creating-a-ri...). We're using Quill.js with ShareDB, which supports JSON structures (which is great, because for us we often have documents with several rich text fields, and other complex structures). So far we've been able to do anything we wanted with Quill, and I've never felt limited by the data structures we have available... (We also do all kinds of other stuff with ShareDB JSON).

I guess one reason you could need custom types would be to ensure consistency - if two keys depend on each other, and one user sets one key, and the other user sets another key, and the document is now invalid, you'd need the engine to be able to reconcile at a higher level?

scofalik · on May 7, 2019

> (...) if two keys depend on each other, and one user sets one key, and the other user sets another key, and the document is now invalid, you'd need the engine to be able to reconcile at a higher level?

I am not sure if I understand you correctly here, but it's not really that. Could you give me a more concrete example?

The kind of problems for extra types are, for example: user A changes a paragraph to a list item and user B splits it. As a result you'd like to have two list items instead of a list item followed by a paragraph. This is impossible if you don't give more semantic meaning to the operations.
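
A toy illustration of that paragraph/list item case, assuming a made-up document model (a list of `{"type", "text"}` blocks) that has nothing to do with CKEditor 5's real model. It compares a split expressed as generic "truncate + insert block" edits with a split that is its own operation.

```python
import copy


def rename(doc, index, new_type):
    """User A: change the type of one block."""
    doc = copy.deepcopy(doc)
    doc[index]["type"] = new_type
    return doc


def split_generic(doc, index, offset, type_seen_by_author):
    """User B's split as generic edits: the new block's type is frozen
    to whatever B saw when making the edit."""
    doc = copy.deepcopy(doc)
    tail = doc[index]["text"][offset:]
    doc[index]["text"] = doc[index]["text"][:offset]
    doc.insert(index + 1, {"type": type_seen_by_author, "text": tail})
    return doc


def split_semantic(doc, index, offset):
    """Split as a first-class operation: the new block copies the type
    the block has at the moment the operation is applied."""
    return split_generic(doc, index, offset, doc[index]["type"])


start = [{"type": "paragraph", "text": "FooBar"}]
after_a = rename(start, 0, "listItem")                # A's change lands first
generic = split_generic(after_a, 0, 3, "paragraph")   # B saw a paragraph
semantic = split_semantic(after_a, 0, 3)

print([b["type"] for b in generic])   # list item followed by a paragraph
print([b["type"] for b in semantic])  # two list items
```

This only covers A's change arriving before B's. In the other order, A's rename would have to be transformed so that it covers both halves of the split, which again needs the engine to know that a split happened.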

There are other problems though, as you mentioned - with invalid document. For example, you have this kind of a list:

* Foo

__* Bar

__* Baz

User A outdents "Bar" and user B indents "Baz" creating a list like this:

* Foo

* Bar

____* Baz

In CKE5 this is an incorrect list (we don't allow indent differences bigger than one). This cannot be fixed through OT so we fix it in post-fixers which are fired after all the changes are applied.
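
As a sketch of the post-fixer idea (not CKEditor 5's actual code; the `(indent, text)` representation and the clamping rule are assumptions), a pass that runs after all changes are applied could look like this:

```python
def fix_list_indents(items):
    """Clamp each item's indent to at most (previous indent + 1).

    `items` is a list of (indent, text) tuples. The first item is
    forced to indent 0.
    """
    fixed = []
    prev = -1  # makes the first item clamp to 0
    for indent, text in items:
        indent = max(0, min(indent, prev + 1))
        fixed.append((indent, text))
        prev = indent
    return fixed


# State after user A outdents "Bar" and user B indents "Baz".
merged = [(0, "Foo"), (0, "Bar"), (2, "Baz")]
print(fix_list_indents(merged))
```

Here "Baz" is pulled back to indent 1, so the result is a valid list again. A real post-fixer might choose a different repair, but the shape is the same: validate the merged result, not the individual operations.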

scofalik · on May 7, 2019

Author of the linked blog post here.

The example cases for additional types / custom implementation are in this section: https://ckeditor.com/blog/Lessons-learned-from-creating-a-ri...

These content-preservation edge cases weren't possible to solve with what was available (at least at the time we started the project).

Even apart from that, ottypes/json0 was lacking some basic things, like moving objects. I see they came up with a new implementation recently (https://github.com/ottypes/json1) and it allows moving objects. Maybe the new implementation would solve some problems. However, it is in "preview" state, and the last update was 2 months ago, so I am not sure how well it will be maintained.

Also, there are some edge cases when transforming ranges (which CKE5 uses to represent, for example, comments on text or content created in track changes mode). I don't want to bury you in difficult-to-understand examples, but if you are interested you might want to check the examples listed in the inline comments for this function: https://github.com/ckeditor/ckeditor5-engine/blob/master/src....

As far as Quill.js is concerned, it is based on a linear data model, which brings limitations when it comes to complex features. Transformation algorithms for linear data models are much simpler, and there are more implementations and articles in this area. Everything depends on your needs. If the Quill.js feature set and functionality fit your needs, then the solution you chose is correct.

With CKE5, however, we didn't want to make any compromises. We needed complex structures for our features, and for having a powerful framework - to enable other developers to write whatever feature they want and have those features working in real-time collaboration. We wanted transformation algorithms that handle all the edge cases. It is true that some of those cases are quite rare. And the old "10/90" mantra applies here, in this case "10% of use cases bring 90% of the complexity". But those edge cases happen and we didn't want to disappoint our users.

mmacfadden · on May 7, 2019

I think the argument is more about the historical data structures that were used in rich text. A lot of editors either used the DOM, or a very flattened data structure like Google Wave, Quill.js, DraftJS etc. With these flattened data structures it becomes harder to represent complex rich text with things like tables, nested blocks, etc. If you have a nice JSON data structure that is collaborative you can do a lot, and in many / most use cases it is sufficient. However, you can run into use cases where the collaborative data structure will ensure consistency across clients, but violate some semantic constraint on the data.

For example, imagine you have an application that has a list that must contain at least one element. Assume there are two elements in the list. A shared JSON data structure on its own (that allows for immediate local edits) would allow two clients to simultaneously delete one element each. The end result is that the client app on both sides will become aware that the constraint was violated only when the remote operation comes in. Resolving this becomes difficult. What is the resolution strategy? Which of the two clients should initiate it? This is a contrived example for sure. But you run into things like this in various use cases, and occasionally you need either new data structures that encode these semantics, or you need an extensible system that allows you to customize constraints and resolutions.
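
The scenario can be simulated in a few lines. This is a sketch with a hypothetical `Client` class and a hand-written index transform, not the API of any real collaboration engine:

```python
MIN_ITEMS = 1  # application constraint: the list must never be empty


def transform_delete(index, applied_index):
    """Adjust a delete index for a concurrent delete that was already applied.

    Returns None if both deletes targeted the same element.
    """
    if index == applied_index:
        return None
    return index - 1 if index > applied_index else index


class Client:
    def __init__(self, items):
        self.items = list(items)

    def local_delete(self, index):
        # The constraint is checked against local state only.
        if len(self.items) - 1 < MIN_ITEMS:
            raise ValueError("would violate the minimum size")
        del self.items[index]
        return index

    def remote_delete(self, index, own_concurrent_index):
        index = transform_delete(index, own_concurrent_index)
        if index is not None:
            del self.items[index]


a = Client(["x", "y"])
b = Client(["x", "y"])

op_a = a.local_delete(0)  # allowed: a still has one element locally
op_b = b.local_delete(1)  # allowed: b still has one element locally

a.remote_delete(op_b, own_concurrent_index=op_a)
b.remote_delete(op_a, own_concurrent_index=op_b)

print(a.items, b.items)  # both replicas agree, and both are empty
```

Both replicas converge, so the data structure did its job, yet the application constraint is broken on both sides, and neither local check could have caught it.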

That said, you can get pretty far with just JSON!

houshuang · on May 7, 2019

tmbb has been doing work on writing a ShareDB compatible backend in Elixir (https://elixirforum.com/t/realtime-collaboration/9736/5), this has long been a dream of mine. If not, having a separate Node server that just deals with ShareDB documents would also be feasible - all the app logic could still be in Elixir. Happy to share stuff we've been doing with Quill, for example my company recently sponsored work on shared cursors/presence for Quill.

houshuang · on May 7, 2019

That's basically what I'm trying to build - we're starting with Quill.js and ShareDB, adding extensions, configuring, and then adding a plugin system based on React, which can take arbitrary components all based on ShareDB sync (think spreadsheets, forms, but I'd also love to integrate the Pyodide stuff to have runnable Python code inside a rich text document, etc). We built this as part of a synchronous collaborative learning platform, but I'm currently trying to extract the key components so that it can be a stand-alone open source library that others can help build on. Very early prototype: https://www.npmjs.com/package/@chilifrog/reactive-rich-text

houshuang · on May 7, 2019

Thanks for the component you provided for shared cursors in text areas (https://github.com/convergencelabs/html-text-collab-ext). I've been using it with ShareDB, and have been working on adapting it to work with single text inputs, which has been surprisingly difficult (mainly dealing with scroll/overflow behaviour).

mmacfadden · on May 7, 2019

Our pleasure! If you want to share what you are doing we might be able to incorporate it into the utility for you! Happy to support a pull request, or to work collaboratively on it.

houshuang · on May 7, 2019

Surprised the author didn't mention Quill.js, which works really well with ShareDB, and is fully open source. We've been doing a lot of fun stuff with it - here's a talk I gave recently as a job talk: https://www.youtube.com/watch?v=gN37rJRmISQ. (It worked, I got hired :)). As a side project, I'm working on a ShareDB backed wiki, where we can use rich text for editing pages, but also other components, like spreadsheets or mindmaps, and everything supports live editing - I think it would be amazing for classrooms, hackathons, any kind of meeting etc. Would love to talk with anyone interested in collaborating. I have been waiting for a few features to complete to make a proper demo video, but this early one shows off some of the ideas: https://www.youtube.com/watch?v=9-lU-in3ydc (https://github.com/chili-epfl/FROG)

nateps · on May 7, 2019

Thanks for the shoutout! We are actively developing on ShareDB (https://github.com/share/sharedb) and if anyone is really interested in this, please reach out to me (https://github.com/nateps). Also, Lever is looking to hire someone to work full time on our internal + open source frameworks including ShareDB.

oefrha · on May 7, 2019

Thanks for ShareDB! Funny thing, I was looking for a self-hosted real-time collaborative editor myself just a week ago, and landed exactly on Quill.js backed by ShareDB. (I also considered ShareDB-backed Monaco, but it doesn’t seem to support OT out of the box, whereas Quill is literally plug and play. Of course one can write a translation layer between Monaco’s change events and OT.)
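
A rough idea of what such a translation layer does, sketched in Python with plain dicts. The `rangeOffset`, `rangeLength` and `text` keys mirror the fields of a Monaco content change, and `retain`/`insert`/`delete` mirror Quill Delta op keys; everything else (the function names, handling one change at a time) is an assumption for illustration.

```python
def change_to_delta(change):
    """Convert one editor change (replace a range with text) into
    a list of retain/insert/delete ops."""
    ops = []
    if change["rangeOffset"] > 0:
        ops.append({"retain": change["rangeOffset"]})
    if change["text"]:
        ops.append({"insert": change["text"]})
    if change["rangeLength"] > 0:
        ops.append({"delete": change["rangeLength"]})
    return ops


def apply_delta(doc, ops):
    """Apply retain/insert/delete ops to a plain string (for checking only)."""
    out, i = [], 0
    for op in ops:
        if "retain" in op:
            out.append(doc[i:i + op["retain"]])
            i += op["retain"]
        elif "insert" in op:
            out.append(op["insert"])
        elif "delete" in op:
            i += op["delete"]
    out.append(doc[i:])  # untouched remainder of the document
    return "".join(out)


# Replace "world" with "there" in "hello world".
change = {"rangeOffset": 6, "rangeLength": 5, "text": "there"}
ops = change_to_delta(change)
print(ops)
print(apply_delta("hello world", ops))
```

A real layer would also need the reverse direction (applying remote ops to the editor without re-emitting them as local changes) and would have to handle events that carry several changes at once.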

nateps · on May 7, 2019

That's awesome! Let us know how it goes.

nused · on May 13, 2019

Enjoyed your demos. Quill.js delta fits nicely with OT (and we do mention it in https://arxiv.org/pdf/1905.01517.pdf).

Here is a Quill.js coediting demo, without using a ShareDB backend: https://codepen.io/dnus/pen/OojaeN

VvR-Ox · on May 7, 2019

I like what you've done.

Is the FROG repo you linked the repo of the wiki?

I'm just curious as it doesn't mention anything about it being a wiki in the readme (it's also a bit unclear how/what to use it for from the description).

houshuang · on May 7, 2019

Yeah, we need to do a lot around messaging. To be clear, the wiki is an early prototype and there are known bugs, etc. However, if you want to try it out, install the repo (run initial_setup, then npm start server), then go to localhost:3000/wiki/ANYTHING?login=YOURNAME. It will create a new wiki named ANYTHING, and there you are.

Or go to https://icchilisrv3.epfl.ch/wiki/hn/Home?login=YOURNAME to test it immediately. (I added a tiny bit of content).

houshuang · on April 27, 2019

If you enjoyed this article, you might also enjoy the novel The Great Passage by Shion Miura, a wonderful quiet novel about the creation of a Japanese dictionary.

jacobolus · on April 27, 2019

Also an anime, https://en.wikipedia.org/wiki/The_Great_Passage_(TV_series)

AlbertoGP · on April 27, 2019

I’ve watched this series and can recommend it for anyone interested in the subject.

houshuang · on April 2, 2019

People interested in block styled editors and new takes on editors should look at hax-the-web, a big effort using web components primarily designed for educational materials at PSU, but usable for so much more. It's quite amazing the flexibility that is possible - here's one recent video, there are many many more on his channel https://t.co/eXM6yBQA7h