How I Judge the Quality of Documentation in 30 Seconds (2014) (ericholscher.com)
93 points by Tomte on Oct 5, 2018 | 59 comments


Here's a summary:

1. Get a website: don't use a readme on GitHub.

2. Use Prose: don't generate from source, you need more.

3. Give permalinks for citation purposes

4. Your URL should acknowledge the documentation version and language, for future-proofing

And now for my own opinion:

1. No, not every project finds it rational to put time into creating a website. Especially if they're small. Sometimes, the time is better spent doing dev work than creating a pretty site to satisfy you.

2. Rails' documentation is generated from source, so is Node's. A lot of documentation is. They give accompanying guides, sure, but there's nothing wrong with generating the majority of your documentation from source.

3 & 4. These are nice to have, sure, but not requirements. If a project is small, I'd rather the solo dev spend their free time solving tough problems.

Essentially: if you're judging a whole project in 30 seconds, I think you're the problem and not the developer that spent all their time making a project for you to use. So, go ahead, "close the tab" as many times as you like there. But if you really want to help someone, open a pull request, or at least file an issue in a courteous manner.


Disagree on the readme.md. At work, we recently moved away from dedicated project/component documentation pages and toward readme.md.

It's the best way to ensure that the documentation is in sync with the implementation: the developer is more likely to update the readme.md in the same pull request he/she changes the implementation.

On Github, I seldom read the market-y website. I click the "Fork me on Github" link and go to the documentation pages on the source itself. It's the most reliable, in my view.


Likewise, I always consult the GitHub docs (usually README.md) and code rather than see the website (which is sometimes out of sync) but usually projects which do not have a dedicated website do not have good READMEs either. This is my experience with full-stack JavaScript libraries/plugins.


Also helps when you have to work offline.

I work on projects while I travel, and I always force my package manager to download the source (not dist) version of the package.

The readme files are now included in the project, and it's a lot easier to just read them. Add an MD viewer to the IDE to make things easier.


You really have to go out of your way to not provide 3. Any half-decent documentation generation mechanism, even a trivial markdown rendering script, will have some sort of support for permalinks.

Short of deliberately not versioning your documentation, it's quite difficult to avoid producing working permalinks.

Generating reference from source is fine, but it's definitely insufficient. Trying to pick up a library or a framework from reference alone is like learning a language solely from a dictionary. Documentation should at least include some general design and architecture elements, and tutorials or commented examples.

Writing accurate documentation mostly requires a thorough understanding of the code; it's one of the hardest things to contribute productively as a newcomer.

I'd much rather the developer did not implement some features (which they took the time to list or maybe even sketch out their intentions for in the documentation), and documented what they did implement thoroughly. That's a much better starting point for contribution than a more or less feature-complete project that needs reverse engineering because there is no documentation at all.


> 2. Rails' documentation is generated from source, so is Node's. A lot of documentation is.

I don't think that's the problem he is talking about. Whether you put your prose in a separate file or as a special comment in a source file is not a relevant difference. Having prose, and having a sensible TOC, is.

Broadly speaking, generated docs tend to be worthless for smaller, well-designed packages, i.e. those where you can browse and read the code+comments in your IDE comfortably. But large and poorly designed packages definitely need generated docs, because their code isn't easily traversed.

Basically I think generated references are a necessary evil. If your project needs them I think that's a wrinkle on the project.


I'm struck by how contradictory the author's guidelines seem to be. If your documentation lives in source code and github readmes, then bam! - you automatically have permalinks to everything that are intrinsically synced to the version of the code. If you keep your docs in separate files on a separate website, you get those things iff you remember to manually generate them.


One example of when generating from source can make for a less than ideal experience is the Monaco editor's documentation[0].

- Most modules don't have intro sections to guide you to where you might want to start looking for the most common tasks. You have to guess the kinds of names they might have chosen for the given tasks. Turns out customizing syntax highlighting is mainly done with setMonarchTokensProvider and not setTokensProvider.

- Many methods or classes have redundant docstrings, like how setLanguageConfiguration[1] says "Set the editing configuration for a language" or setTheme() says "Switches to a theme."

- To find out what you can do with instances of a code editor, you don't click "monaco.editor" in the right sidebar like I kept thinking. Instead, you'd click the create() method on that page, and click through to its return type, which is actually IStandaloneCodeEditor.

- The search field only searches on exact prefixes of the basename. This isn't exactly a generate-from-code problem but a typedoc UI issue that led me to just download the code and use search-in-project instead (within VS Code ironically enough).

Fortunately, for the most common tasks, the Monaco playground[2] and Monarch playground[3] were helpful in pointing me in the right direction. So that offsets some of these annoyances. But I actually switched away from generating bland documentation via typedoc for my app, and started using a custom documentation generator that allows me to structure my app's docs in a way that is more human-friendly.

[0] https://microsoft.github.io/monaco-editor/api/index.html

[1] https://microsoft.github.io/monaco-editor/api/modules/monaco...

[2] https://microsoft.github.io/monaco-editor/playground.html

[3] https://microsoft.github.io/monaco-editor/monarch.html


Node.js actually has separate docs[1]. The only documentation you'll find inline in our source is documentation for other developers working on the source.

[1]: https://github.com/nodejs/node/tree/master/doc/api


> Rails' documentation is generated from source, so is Node's.

Indeed! There's no better documentation than the code itself.

Godoc —the documentation tool for Go— also generates it from the source code.

Concise and clear, here is an example input [1] and its corresponding output [2].

[1] https://golang.org/src/net/http/doc.go

[2] https://golang.org/pkg/net/http/


> Indeed! There's no better documentation than the code itself.

Except for all the things you need to know in order to use the code. Which is why you still write actual real honest-to-God prose documentation. Or, if you don't, it's why I don't bother trying to use your code.


If you read the links given, you'll see that is in the godoc. Godoc has a place to put prose documentation. The formatting is quite weak (you get headers of more than one word, paragraphs of text without italic or bold but with URLs turned into active links, etc, and preformatted-code), but it's often enough. (If it really isn't enough, link somewhere.) Godoc also has official support for examples, which are included in the documentation and tested as part of the automated test suite, ensuring they stay up to date if written even halfway reasonably.

This is not specific to godoc. Most, perhaps effectively all (certainly all the ones I've seen and used), automated documentation generation tools have a place to put arbitrary prose at the top. The fact that it is so often unused is a developer problem, not a tool problem.
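The same pattern exists outside Go. As a minimal sketch, and an analogue rather than godoc itself: Python's standard `doctest` module runs the examples embedded in docstrings, and the module docstring is the place for prose. The module and the function `celsius_to_fahrenheit` are made up for illustration.

```python
"""Convert between temperature scales.

This module-level docstring is where the prose goes: what the module is
for, how its pieces fit together, and where to start. Examples written
in interpreter style are executed by doctest, so they fail loudly if
they drift out of sync with the code.

>>> celsius_to_fahrenheit(100)
212.0
"""


def celsius_to_fahrenheit(celsius):
    """Return the Fahrenheit equivalent of a Celsius temperature.

    >>> celsius_to_fahrenheit(0)
    32.0
    """
    return celsius * 9 / 5 + 32
```

Saved as, say, `temperature.py` (a hypothetical filename), running `python -m doctest temperature.py` checks both examples and prints nothing when they pass.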


I’ve published Open Source code and documentation. It is a lot of work.

I think that “How I judge someone else’s hours of effort in 30 seconds” is a toxic attitude.


It's often very useful to quickly evaluate a project. I don't think the author of this article intended any disrespect or to diminish someone's hard work. It's more about efficiently navigating a vast landscape of tools. Although, perhaps this could have been made more clear.

Say some random project has 10 stars on github and no commits in the past 2 years. Good documentation and working examples could be the difference between using it and not using it. If you only have a single day to spike the feature, you need to efficiently explore possible solutions.

Obviously, this depends on many factors. How critical is the feature? What are the alternatives? How long would implementing it in house take? And so on.


I'm with parent, even my bad READMEs took a lot of time. Consumers of open source are just so damn entitled. It's discouraging.


OK. I use a very similar set of heuristics to what's described in this post.

Here's the documentation for the most recent thing I released:

https://django-registration.readthedocs.io/en/3.0/

And the one before that:

https://pwned-passwords-django.readthedocs.io/en/1.3.1/

Here's the next of my packages that I'm working on, not because the code needs work but because the documentation isn't up to my standards anymore (in fact, I'm doing a rolling refresh of all my personal packages right now):

https://webcolors.readthedocs.io/en/1.8.1/

I don't expect everyone else to match my output in documentation. I do expect people to write some type of prose documentation covering more than just "here's an auto-generated API reference, good luck", or "here's a README with a couple examples, good luck". And I absolutely treat quality of documentation as a predictor for quality of code, because it tends to be a pretty strong predictor.

Want to tell me how "entitled" I am?


> Want to tell me how "entitled" I am?

Sorry I didn't notice the provocation until now! You are very entitled, and you overestimate how much time most people have to contribute to open source projects. I'm happy if a FLOSS project even provides moderately recent API docs, which apparently upset you.

For what it's worth, I attribute most of Django's success to its top-notch documentation. I've shipped large Django projects and think it's solid software, and I greatly appreciate your contributions. At the same time I recognize that Django won what is essentially a popularity contest. Web frameworks are a crowded space and beginner-friendly docs are required. If I'm open sourcing a library that is the only one of its kind, priorities differ. I don't think you recognize that difference, and you're judging other projects as you would a web framework. Some of the best libraries I've used came with little more than API docs. Selecting open source libraries by the quality of their documentation is a risky practice, to say the least.


Note that "anothergoogler" said that consumers of open source were acting as though entitled, not producers.


Well, I'm both a producer and a consumer of open source code.

So is Eric, who wrote the "entitled" post being complained about.


fantastic work, I agree with you - most of the time the "documentation" for open source work is done by more experienced devs answering questions from noobs on forums like stack overflow and this is a crazy state to live in


It’s toxic and lazy too. You should first go look and see if the actual code is high quality/does what you want.

If your idea of shopping for dependencies is looking at the amount of GitHub stars and the quality of the documentation: you are going to miss some real diamonds in the rough.


Is it laziness or efficiency? It's, in my read, a heuristic (he even uses that word) for deciding whether it's worth spending time reading the docs before looking at the actual code, to see if it does what he wants.

It's about how he should best spend his time, which is finite, after all. Paid time even more so.


It’s a really bad heuristic.

I evaluate projects by (1) actually reading their source code, (2) looking up their frequency of updates and release packaging, (3) the community interaction of authors and contributors.

The total amount of effort it took to write that blog post could have been spent on submitting PRs to improve documentation of various projects.


If I'm looking for a library to solve a problem, and have a choice between multiple options, you'd better believe I'm going to have some "30 second" heuristics guiding me toward what's likely to be worth my time exploring further.

People who put in the effort to have good documentation are more likely to also be producing good software that's worth the time I'll put into trying to learn it.

And no matter how righteous you want to get here on HN in proclaiming that to be "toxic" and trying to read it uncharitably, I'd bet all the money in my wallet right now that you have 30-second heuristics you use to judge whether a piece of software is worth your time to explore.


The judging in this case is judging how much time a developer wants to spend investigating a project that they are evaluating for their own use. I didn’t read it as being judgemental about other people’s work for its own sake.


Exactly, if everyone were held to this standard, there would be far fewer projects out there.

Hell, I've thrown some undocumented projects out there with the intention of cleaning them up later or if they gain traction. I have so many side projects that I couldn't possibly reasonably document them all.


By not documenting them you make clear that they're not intended for production use, which is perfectly valid in my opinion. I also do this a lot.


Interesting to see how someone experienced evaluates documentation. I've been doing Curriculum Development for several years, and it's easy to identify bad Curriculum. If I flip through a course and don't see a lot of diagrams (20%-50%) then I know that somebody didn't know all the research showing that diagrams aid recall and simplify the understanding of relationships. And if I don't see frequent labs/practice, then somebody didn't know all the research showing that the brain needs to have new concepts reinforced, and skipping that step prevents new information from being well-absorbed. I could go on, but those two are typical. And people wonder why Training isn't more effective. I have seen bad training sink more than one high-tech company - though nobody at those companies seemed to understand that. There is also an art to properly decomposing complex topics so that they can be "intuitive". My very best technical courses were often criticized as being "easy" and "obvious" - but only by those who hadn't tried to learn the same information before.


> With GitHub Pages, Read the Docs, and other places to host generated documentation for free

> If your documentation is generated from source code, I am immediately skeptical

Looks like another self-important troll. Those statements are just lolwhut.

I'm not going to learn (or force others to learn) a tool to read documentation. Put a /docs folder in the appropriate project directories and have the team that handles it, decide what to put in that (even if it's a URL to somewhere else).

Disregarding existing documentation because you dislike the organization/formatting is such an act of blind ignorance, I'm surprised he thinks he knows anything about it. Documentation is not a solved problem (although it's easier than testing).

> If you included all of the things needed to document a project in source, your code would be unreadable.

That's only because modern IDEs haven't implemented inline-documentation methods yet. One day we'll get companion documents with every sourcefile that will allow developers to read notes and link to associated content from orthogonal comment files.


While I don't agree with the article 100%, I too am skeptical when I see documentation generated from source. A complex topic requires decomposing the information into different types of information with different approaches to explain it. Not everything is "here is the API". For example there may be an important overarching conceptual model, and information which maps into it. I don't know any way to express that in any of the source program documentation tools (at least, I've never seen it done in the many hundreds of projects I've seen.) The problems go well beyond formatting and organization. (Though, some things are simple APIs where some trivial examples and some minimal text can explain it sufficiently. That does describe a lot of existing libraries.)


Sphinx does this better than most since it is a general purpose documentation tool first with API doc generation as an optional add on. After setting it up you can easily add manually written documentation as needed. Most importantly, you can easily cross-reference the API docs and the manual prose from each other.

Doxygen and its ilk are overly focused on generated API documentation extracted from meta-comments and you rarely see them used with well organized manually written text.


I appreciate and agree with all of the author's points, but I think that a few really well thought out and explained examples on a simple Github README are worth far more to me from a practical standpoint. I generally will look at the examples, try them out in a stand alone test, spend a few minutes seeing what the quality of the source code is, and then make a decision to go further with it or not. For a smaller project that only has a limited set of features I think most of the author's points are maybe a bit overly picky.


I'm very thankful for Rustdoc. Every library's documentation has the same, easily searchable format. The syntax for writing documentation is easy to use and standard across all Rust projects. You can easily generate HTML documentation from your source code and standalone markdown files. crates.io will generate your documentation and host it for you when you publish a library. Your example code in your documentation can actually be compiled and tested so it stays up to date. It's been a joy to use and a major reason I like programming in Rust.

I don't know about localization support, but it addresses most of the other issues outlined in the article.

https://doc.rust-lang.org/beta/rustdoc/what-is-rustdoc.html


As hard as Rust is in itself, it tries to lower most of the other barriers to entry for new programmers. I think that a central package repository coupled with a sane package manager and consistent documentation is a must-have for any new language on the market. Both Rust and Elixir have an official package manager and repository with documentation hosting and it makes life much easier. This and not having to worry about another holy war later on.


Localization is basically non-existent :( it’s a thing we want to work on, but aren’t sure how to accomplish.


I've discussed this a fair bit and we have a rough plan, but the idea is to wait for other things to localize first.


Ah great!


I know that writing good documentation is hard, and that a lot of devs don't like to do it, but I completely agree with TFA.

If the project isn't worth the time to document properly, it's self-announcing that it isn't worth using that code. Why would anyone use throwaway code posted on GitHub? Even if it's been written by a "rock star," the risks are too high.

The philosophy of OpenBSD is the correct one, I think: incorrect documentation is a bug and should be fixed with the same energy that is directed toward bugs in code.

If it's only going to be used by you, great! Don't be offended if someone else looks at it and says, no thanks because of the lack of documentation.


Code is the best documentation, it's what I seem to always end up consulting. And always in sync.

Plus some guide to the overall architecture. Now... a pattern language was meant to provide this, but failed. Perhaps, since a program is a theory, to understand the structure of the code, you must understand the structure of the problem/domain... and there's no shortcut. But even if so, a helpful guide seems possible.

Autogenerated docs can help navigate the codebase, but tools can autonavigate just as well. They could show the overall code structure, but don't. They could be in sync, but aren't.

A minimal but complete (all-steps) example, partly to learn from and build on, and also to assure me that it works.

Idea: I keep kinda expecting the text alongside each file/dir on github to be a comment on that file/dir (instead of its latest commit message). Could be incredibly valuable for navigating a codebase and grokking its architecture. (Though prone to desync.) I also like the format of a commit message for this: <50 char summary line, optional further lines.

code: electronic documentation: edoc


pull the ladder up after you mentality, how is someone new to your language/stack going to understand your code without a similar level of knowledge? this type of thinking leads to abandoned codebases and full product rebuilds in another language with better documentation. currently migrating a product for this very reason.


This was based entirely on my experience with other people's projects that I knew nothing about.


I think the language thing is a bit unfair.

If your service targets non-English speaking countries, 100% you need your documentation translated.

Otherwise - why?


Agreed and also, maybe it will support other languages in the future - just not at this moment.


Given that GitLab/GitHub/etc (and, often, online package repositories) automatically display markdown files as rendered html, a directory of markdown files is a fine way to document your project. The docs can, at any point, optionally be processed into html that goes into a separate website (lots of folks have written tools for this. My own is [Rippledoc](https://gitlab.com/uvtc/rippledoc)).

Also, if the project is a library or framework with a public interface, you need both API docs generated from source, and prose docs.


I noticed a lot of FOSS bug reports/emails for my projects are coming from foreigners lately, and I spent a lot of time thinking about how to help. Translation seemed insurmountable. I considered a lot of options.

My conclusion was to always strive for brevity and clarity. Say it all, say no more, be clear, and be concise. I now consider a foreign audience that may take significantly longer to understand what I write: they may use translation tools, and they may be confused by any "flowery language".


Eric runs the yearly Write The Docs conference in Portland. I highly recommend attending if you are interested in documentation from a cross-functional perspective.


My company should start hiring from that conference. We really need people who care about docs. Devs usually hate it.


They actually have a fairly active Slack [1] with a job posts channel, as well as a Job Board [2] on their main site.

[1] http://slack.writethedocs.org/ [2] https://jobs.writethedocs.org/


How I judge the quality of a judgment: if its maker boasts how little time they spent making the judgment, it’s probably pretty uninformed.


>Someone shouldn’t have to learn Programming and English at the same time.

However the attitude seems to be very much "learn Chinese/local language or gtfo" from the opposite end wrt localisation ie it is nonexistent. I think devs are really giving up something valuable by not pushing for a global standard language for development.

sure I shouldn't HAVE to learn English and Development at the same time BUT who is developing in my language anyways? no language support, no compiler support, no Unicode support and no readable font support. instead of bootstrapping all of that (an impossible task as there is no requirement and no funding from any sources, I've tried but the govt and general public range from apathetic to stubborn on this issue), it was easier for me to learn English. I could have theoretically learnt the language one level up on the linguistic chain but I would eventually hit the same issues as I tried to use the available tools.


It's a nice thought in some ways, but that requirement in particular is where I figured the author must be talking about massive projects, because I'm sure not taking my ~100 man-hour open source project documentation and getting it translated into half-a-dozen languages. A project has to be pretty big (multiple orders of magnitude larger than that) before it can justify having documentation in anything but whatever language the probably-single developer is most comfortable documenting in.


I was excited about this, but it was mostly worthless. The equivalent of a "how to write a great paper" guide explaining font size, where to put your name, to include a table of contents... And that's it.

Nothing about what to write and how to write it? This is the hard part.


Internal development is very different than externally facing development. For example, generated API docs are a horrible thing for purely internal code, because you need a documentation build and deployment system that inevitably breaks and nobody wants to maintain and gets out of sync and suddenly the documentation is stale and there's some engineer on the brink of rage quitting if someone pings them to manually rebuild the documentation pages one more goddamn time.

Having it just in the repo as a set of Markdown docs or something is much better, and "the website" is just URLs to those files in GitHub. Same risk of becoming stale as anything else, but no overhead of documentation build and deployment.


> If you don’t provide a language in your URL, you are implicitly sending the message that the documentation will never be translated.

Is the `accept-language` header basically dead, these days?


Yes, and good riddance; a GET request for a given URL should return bit-for-bit identical results[1] regardless of browser differences[0].

0: Explicitly stateful features such as API endpoints or "logged in as" fields aren't browser differences.

1: The resulting file, not necessarily the TCP stream used to transfer it.

2: Yes, there are exceptions (eg http://canhazip.com/more), but documentation is almost the diametric opposite of being one of them.
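A minimal sketch of what taking the language from the URL rather than from a header could look like, assuming a `/<language>/<version>/...` path layout like the Read the Docs links posted elsewhere in this thread. The function name `docs_locale` and the example.org URL are hypothetical.

```python
from urllib.parse import urlsplit


def docs_locale(url):
    """Return (language, version) from a docs URL shaped like /<language>/<version>/...

    Only the URL is consulted, so the same URL always maps to the same
    language and version, whatever headers the browser sends.
    """
    # Drop empty segments produced by leading, trailing, or doubled slashes.
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if len(segments) < 2:
        raise ValueError(f"no language/version prefix in {url!r}")
    language, version = segments[0], segments[1]
    return language, version


print(docs_locale("https://docs.example.org/en/3.0/install.html"))  # ('en', '3.0')
```

Nothing here reads `Accept-Language`, which is the point: the choice of translation is visible in the link itself and survives copy-paste.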


Thank God, yes.

I mean, in theory it's great. In practice you always got cringeworthily bad translated versions of the Debian web site.

(A few years ago Debian's German pages got much better)


Why would choosing language via URL be any different than choosing language via header, with respect to the quality of the translation?


For APIs, I find having a good test suite means I can get started understanding how to call the library quickly.


Judging books by covers. First time I've seen it in the wild!



