Linus on Line Breaks (2020) (lkml.org)
244 points by belter on Nov 28, 2021 | 188 comments


Shorter lines are easier for the eye to visually scan without accidentally skipping up or down during the leftward return saccade.[1]

80 characters happens to be just a little longer than the supposedly ideal measure for continuous text promulgated by Robert Bringhurst.[2]

I take the even more draconian approach of wrapping to 72 columns when writing comments, after the fashion of PEP 8.[3]

Of course, Linus will dictate the style of his projects' codebases as he deems fit.

[1] https://en.wikipedia.org/wiki/Saccade

[2] http://webtypography.net/2.1.2

[3] https://www.python.org/dev/peps/pep-0008/#maximum-line-lengt...


- Code is not prose. It's not read like a book so it's not clear that an "ideal measure for continuous text" is at all relevant.

- Most lines are short even if the limit is more than 80 characters, so you're unlikely to accidentally shift up or down after reading a long line.

- Even if there turns out to be some benefit for reading code based on saccades, that means nothing unless you can make some quantitative comparison with the other advantages and disadvantages of a longer line length.


For me it's not just the length of the line but the level of indentation. I think wide indents are good for readability, so if you're using standard tabs and also sticking to 80 columns as a hard limit then you have only 64 characters to work with at two levels of indentation. Excessive indentation is a clue that you may want to refactor, but two levels is pretty common.

80 characters starting at the point of indentation is more reasonable. At two levels of indenting, you might get out to 96 characters, which still fits comfortably in most GUI terminal windows.
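
The arithmetic here can be checked with a small sketch. It assumes "standard tabs" means 8-column tabs; the function names are hypothetical and only illustrate the two ways of counting.

```python
def columns_left(limit: int, indent_width: int, levels: int) -> int:
    """Columns available for code when the limit counts from column 0."""
    return limit - indent_width * levels


def rightmost_column(limit: int, indent_width: int, levels: int) -> int:
    """Rightmost column reached when the limit counts from the indentation point."""
    return limit + indent_width * levels


# Hard 80-column limit, 8-column tabs (assumed), two levels of indentation.
print(columns_left(80, 8, 2))      # 64
# 80 characters measured from the point of indentation instead.
print(rightmost_column(80, 8, 2))  # 96
```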


2 is the bare minimum: it’s a function and a conditional or loop. 3 or 4 is pretty common.


The biggest problem with short lines isn't even the line length, it's that it encourages stupid variable names in order to keep things on one line. Silly abbreviations that no one new to the code can figure out are the bane of my existence.


Why does code have to be "read like a book" to be considered "continuous text"? Of course code is continuous text.


No it's not: a book is generally read from cover to cover (at least fiction) while code is split amongst various files. One is free to read them in any order, which is the opposite of continuous text.


And like with files, you are also free to read chapters of a book in any order. But files have little to do with lines, no? Of course a function definition in every sane language is continuous text, even though you "are free" to read it backwards or starting in the middle.


That's for books.

When reading code, I'd rather see a single statement on one line than break it into 2 lines because it passed some ancient 80 char limit...


Thank you. I was looking for this.

I don't care if a line needs to be 180 chars if it starts with log.debug. Just like searching, I read my code globally based on the first 10-20 chars. If I need details on a line, I'll read the whole line.

I do however put chained methods on new lines, because they are often a new statement.

And I use 4 spaces for indent, because my IDE is smart enough, it is in my programming language guidelines, and that's what my company uses.


I'd rather not have that statement be more than 80 chars long; it might be a one-liner where many things are happening, in which case there _is_ value in breaking it up, or maybe the variable names are too long and not helping readability.


Long variable names can cause long lines.

So can simple arithmetic that would be worse with intermediate values.


"ancient" doesn't mean wrong. Round wheels are ancient, it doesn't mean we should switch to the square wheels.


>Round wheels are ancient, it doesn't mean we should switch to the square wheels.

No, but it means we better use rubber wheels, not stone ones...


It doesn't mean right either, and the person you are responding to gave specific reasons why they prefer longer.


So it’s almost as if the “ancient” qualifier was useless.


Well it was clearly meant to imply that the reason we are using it is merely tradition, and that there was a reason a long time ago that no longer applies.


It was intended to mean "relevant in old times".


And we all use 80-column wheels because that's what the ancients used, by Zeus!

Oh wait, we use wheels sized for their use cases…



We also shouldn’t burn people deemed witches at the stake. Being ancient doesn’t make it right, either.


Even PEP8 says "Some teams strongly prefer a longer line length. For code maintained exclusively or primarily by a team that can reach agreement on this issue, it is okay to increase the line length limit up to 99 characters, provided that comments and docstrings are still wrapped at 72 characters."


> it is okay to increase the line length limit up to 99 characters

PEP8 is strongly biased towards making code readable, and has helped a generation of programmers start thinking about readability when coding. The line length limit, though, is one of the few parts of PEP8 that introduced defects as a practice. Holding to 80 characters causes:

* string concatenation bugs from breaking up long strings

* errors in expressions broken up into multiple expressions just to fit

* bias to use shorter variable names

My favorite part of PEP8 was the warning about "foolish consistency".
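
To illustrate the first bullet: a minimal sketch of how Python's implicit concatenation of adjacent string literals can hide a bug when a long message is split to fit a narrow limit. The message text and variable names are made up for illustration.

```python
# A long message split across two lines to stay under the limit.
# Adjacent string literals are joined with nothing in between,
# so the missing trailing space silently fuses two words.
buggy = ("Could not open the configuration file;"
         "falling back to defaults")

# The fix is a single space, which is easy to forget when wrapping.
fixed = ("Could not open the configuration file; "
         "falling back to defaults")

print(buggy)
print(fixed)
```

Neither version raises an error or a warning, which is why this kind of defect tends to survive review.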


Are we supposed to be reading the PEPs?


If you care about the reasoning behind whatever feature they are implementing.


Comments suffer even more from line length limit. I saw it many times when a comment is changed and then all following lines get rewrapped.


> I take the even more draconian approach of wrapping to 72 columns when writing comments, after the fashion of PEP 8.[3]

Which also happens to be how Linus wrapped the lines of this particular email.


Plaintext email is wrapped to 72, that's in the RFC - Linus didn't have much choice there.


Sure did:

> Each line of characters MUST be no more than 998 characters, and SHOULD be no more than 78 characters, excluding the CRLF.

https://datatracker.ietf.org/doc/html/rfc5322#section-2.1.1

https://www.arp242.net/email-wrapping.html
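
The two limits quoted from RFC 5322 can be written down as a small check. This is only a sketch, not part of any mail library: `classify_line` and its return labels are hypothetical, and it assumes ASCII text so that one character is one octet.

```python
MUST_LIMIT = 998    # hard limit from RFC 5322 section 2.1.1, excluding CRLF
SHOULD_LIMIT = 78   # recommended limit, excluding CRLF


def classify_line(line: str) -> str:
    """Classify one message line against the RFC 5322 length limits."""
    # Strip the line terminator first: both limits exclude the CRLF.
    body = line.rstrip("\r\n")
    if len(body) > MUST_LIMIT:
        return "violates MUST"
    if len(body) > SHOULD_LIMIT:
        return "exceeds SHOULD"
    return "ok"


print(classify_line("x" * 72 + "\r\n"))   # ok
print(classify_line("x" * 100 + "\r\n"))  # exceeds SHOULD
print(classify_line("x" * 999 + "\r\n"))  # violates MUST
```

A 72-column wrap passes both checks, but nothing in the quoted text requires it.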


I think I heard somewhere that the studies about line length don’t actually give very strong evidence in favour of short lines. On the other hand, I don’t have a source for that. Do you know if they are any good or is this just repeating common wisdom?

Some arguments in favour of longer lines would be:

- monospaced fonts are relatively wide so the line is not long in terms of the number of characters

- there can be a lot of non-length on the left hand side due to indentation (especially in Linux where tabs are wide)

- programmers are relatively literate and so might perform better than average at finding the start of the line

- editor features like line highlighting can help to find the line

- source code has a very ragged edge and so it is easier to find the start of the line than in dense prose.

- magazines or their readers prefer narrow columns for stylistic reasons. E.g. newspapers sometimes have very narrow columns that lead to intraword spacing that makes things hard to read


1) has tenuous relevance at best

2) The 80 or 66 char or whatever is literally a non-science driven designer rule of thumb that's actually the opposite conclusion from results of actual research into the issue (granted it's not exactly a deeply studied niche but there is research).


Arbitrary line breaking is ugly and an unnecessary source of errors. I avoid writing lines longer than 80 chars but in some cases it feels better and more ergonomic/readable to exceed it.


It depends. This is not an unbreakable rule.

E.g. if you have 4 long lines all following the same pattern with minor modifications, the best way to both scan quickly and avoid bugs is to write them on their own line fully and align the common parts vertically.

Splitting each line into smaller parts for the sake of "vertical scanning" actually achieves exactly the opposite in this case.


Book lines are not indented at different nested levels while code is. So, unless your line limit is 72char+indentation, it makes little sense to follow this deeply cumbersome policy which will result in developers making single letters for all variables and 2/3 letter function/method names.


When "visually scanning" code, in 90% of the cases, only the first 10 characters matter anyway, so I fail to see your point. How often do you really need to grasp, say, all function arguments at once?


Opinions about readability "flow" that rely on numbered footnotes strike me as the height of irony, but that's me.


The main problem I often run into is not wrapping when reading code in an editor, but rather wrapping when reading side-by-side diffs, for example in a github PR or review board RB. You get not only double the code width, but also an extra 6 or so columns on each side for line numbers, plus margins and scrollbars; so for 80-column text your window really needs to be more like 180 columns wide if you don't want any wrapping.

On my typical developer laptop a full-screen terminal window (which I personally never use) with a system default 11pt font is 202 columns. A modest bump to 13pt drops that to 176 columns, meaning I can't view a side-by-side 80-column wide PR diff without some line wrapping.

That's great that you have a 350-character wide display. Requiring everyone to have acres of screen space, either because they can afford a 34" monitor (or 2 24" monitors) OR because they have good enough eyesight to be able to use a microscopic font is excluding a lot of people.
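
As a rough sketch of that estimate: the 6-column gutter comes from the comment above, while the 8 columns for margins and scrollbars are an assumption chosen to match the "more like 180" figure. The function name is hypothetical.

```python
def side_by_side_width(code_cols: int, gutter_cols: int = 6, chrome_cols: int = 8) -> int:
    """Columns needed to show two panes of code_cols each without wrapping."""
    # Each pane needs its code plus a line-number gutter; the margins and
    # scrollbars (chrome_cols, an assumed figure) are counted once overall.
    return 2 * (code_cols + gutter_cols) + chrome_cols


needed = side_by_side_width(80)
print(needed)  # 180

# The two terminal widths mentioned above: 11pt and 13pt fonts.
for terminal_cols in (202, 176):
    print(terminal_cols, "fits" if terminal_cols >= needed else "wraps")
```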


I dunno; I don't use an external monitor, and code on a 1080p 13" laptop screen. At my normal font size, a full-screen terminal is 190 characters. That's fine; and I try to soft-limit my line width to 120 characters or so.

Personally I don't like side-by-side diffs; I'm fine with unified diffs, so I guess I don't run into that problem.

But still, I recognize that nearly everyone else I know codes on a larger monitor (you can get a decent, but not great, 27" monitor for $300 or so) with much more horizontal space, and I wouldn't ask them to cater to my small-screen preferences.


Not an IDE user but in textmode I really like using GNU diff -y for a variety of file comparison tasks. There is no word-wrap and it is readable on smaller screens.

For example, I can dump two files to single column hex using od and then do diff -y to find the bytes that differ. A nice complement to cmp -b.

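      # Usage check: exactly two file arguments are required.
      test $# = 2||exec echo usage: $0 file1 file2\; for small files
      # Dump each file as one hex byte per line (GNU od).
      od -An -tx1 -vw1 < "$1" > "$1.hex"
      od -An -tx1 -vw1 < "$2" > "$2.hex"
      # Side-by-side diff, viewed in less starting at the first difference.
      diff -y "$1.hex" "$2.hex"|tr '\11' '\40'|sed '/[<>|]/s/  / /g'|less -Jj5 -NGp "[<>|]"
      exec rm "$1.hex" "$2.hex"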


You can deal with line breaks. Is it the end of the world?


Again a tooling limitation. Use github or horizontal diffs.


Look at your diffs in horizontal splits?


AFAIR, not an option for RB.


Adding horizontal diff support was the single thing that made me happy when my company got rid of RB.

I can't believe that in 2021 they still haven't added it.


I remember when this was posted. I agree with Linus.

Most monitors have been wider than they are tall for many years. Yet, people insist on putting silly horizontal task bars at the top or bottom, thus making the problem even worse.

My mind is blown on an almost daily basis when one of our junior developers shares their screen with me.

Two of them consistently have:

1. horizontal task bar

2. display scaling set to 125% or worse (1080 native)

3. a terminal launched from VSCode (whaaaaaaaaaat) they leave on the bottom

That setup leaves them with a total of 30 lines of code. Literally more than 50% of their desktop remains unused most of the time. I don't want to force my opinions on them but stuff like this takes its toll and triggers my OCD. Only it's not an OCD, I just can't see shit


What's wrong with 2? People can have trouble reading small text.


It’s the default on many laptops and is often completely inappropriate


What's wrong about it tho?


You can increase the font size if you have trouble reading. Scaling the whole UI makes everything bigger, including elements that you don't read. Also depending on the OS scaling can be implemented more or less well and efficiently.


Unfortunately, Mac (integer scaling) and web (with the takeover of 1px[*]) have driven this UI concept that everything has to be scaled up for High-DPI screens. And early Gtk+ 2.* versions would gladly take any font size and resize UI elements so the label fit.

[*] though in all honesty, 1px as a relative measure of angular size is fine, it's just very badly named — I was hoping web designers would switch to ems in early days of the web, allowing users to control everything through font size directly — but that would still require some smartness not to increase padding too much with large fonts: basically, we'd want something like flexible spacing a-la TeX's hskip/vskip/hglue/vglue but catering to the screen and nuances screens possess.

Almost no UI (including web pages) is designed to work well with custom font sizes today. It's a definite win of form over function, or at the very least, recognition that designers are unable to design truly fluid/responsive interfaces that they like the look of.

Gnome still allows you to make use of gnome-tweaks to adjust just the font sizes, but you still run into issues because spacing can be too small (thus I tend to run with 125% scaling and 1.4-1.5x font sizes on HighDPI screens).


If they're on a 1080p monitor, agreed scaling shouldn't be needed, but anything above 1440p (unless the monitor is absolutely enormous) really does need to be scaled at at least 125%. Otherwise you'll get closer and closer to the monitor to read it without even realizing it


1) Like 100% of the people who’ve upgraded to Windows 11, since it’s now mandatory. 2) Nobody scales up for fun. It’s to be able to see what’s written. Especially when it means getting under « 1080p » real estate. 3) Terminals in IDEs are buggy (at least VSCode and JetBrains). Vertical terminals in IDEs are _super_ buggy. But terminals in IDEs are integrated with your current workflow. So you are basically saying that they should not use an IDE and … it’s an uninteresting debate.


> 1. horizontal task bar

That's something I honestly never thought about. I'll try it on the side for the next week, thanks for sharing the tip!

> 2. display scaling set to 125% or worse (1080 native)

Be grateful that your vision is still great I think. I have Hacker News on 125% because it's hard to read for me when it's smaller than this.


I finally broke down and got progressive lenses recently (one of my better decisions). However that means only a small portion of the monitor is in clear focus. So I also bought dedicated computer glasses (basically my regular distant prescription backed off by about 1 diopter) -- second best decision I made. Now I can finally go back to a non-zoomed screen.


> So I also bought dedicated computer glasses (basically my regular distant prescription backed off by about 1 diopter)

That's very interesting and something I never really heard about before. Thank you for sharing that, I spend a lot of time in front of the computer so dedicated glasses would make a lot of sense.


> 1. horizontal task bar

I use a horizontal task bar, out of habit and default settings, I guess. I have considered putting it on the side but haven't tried it seriously. I do like having the titles of the tasks, and that would probably not be so comfortable vertically.

There are only so many lines of code my brain can handle anyway, and I don't think the screen height is the bottleneck.


Ever since Vista, I've set the task bar to be "XP" style. Small icons, and not grouping windows, ever. I don't get the point of needing to click twice to switch between explorer windows or whatever.

I don't know why the huge, pointless icons are the default.


I have my task bar autohide and haven’t used it in years. Seriously, I never see it. Between cmd+tab and Alfred to launch apps it’s just redundant.


3. VSCode has the ability to interact with the console (for example automatically attach a node debugger whenever you run something calling node in that console), so that shouldn't necessarily automatically be scoffed at.


You'd hate my display which is often in 9:16 mode :-)


Genuine question: What’s wrong with 3?

I mostly don’t need my terminal after getting docker up and running, and if I just want to view the output, I can always open it (Control + V).

And I usually have vscode and chrome open.


If you leave it there, it eats vertical space.

I usually don't like IDE terminals: they don't do anything that you can't do in a real terminal, but come with all kinds of problems that wouldn't be accepted in a single-purpose program. But that's my preference, yours is fine too. The only real problem is leaving stuff eating up a portion of the screen when you are not using it.


They have one feature that I like, at least in the VSCode terminal: if you click on something that resembles a compilation error (filename:line) it will open the editor on that file at that line. That works with every compiler/tool and it's useful.

Other than that, I don't like to leave my editor to switch to a terminal. Thus when I develop I usually use the VSCode terminal because it's more practical, to compile, run tests, use git (yes, I know that VSCode has git functionalities built in... I find the terminal more practical and fast)


> What’s wrong with 3?

You may have missed the "[that] they leave on the bottom" part of 3. It sounds like you don't leave your terminal open/visible unless you press ^v


Horizontal task bar takes up much less space.


> 1. horizontal task bar

That is what makes the most sense if you want to have as much information there as possible.


Linus is making the typical bad faith argument that people use 80 char limit because of hardware limitations.

No one has ever advocated 80 because of hardware limits. Hardware has progressed but our eyes haven't.

In prose, shorter lines help because you can easily go to the next line.

For programming, shorter lines help because you can navigate your function better because most of what you're tracking is in a smaller area. With long lines you have to move your eyes more.

Shorter lines also force you to write one param per line when calling a function. No one is saying you need to mk yr vrbls shorter.


> Linus is making the typical bad faith argument that people use 80 char limit because of hardware limitations.

Disagreement does not mean "bad faith".

The character limit is 80 because of hardware. The argument for using 80 characters is not "use short lines because it's easier to read". It's not "use 85 characters if it increases readability and understanding". It's "80 characters is a hard limit and if you go over we're gonna wrap or truncate"

Yes. It is 80 and not another number only because of hardware and history. It has never been about readability.


Arguably, the hardware was designed to support an adequate and sufficient line length.


> Shorter lines also force you to write one param per line when calling a function.

And this is unconditionally good why?

Linus isn't wrong when he describes lines as a natural unit, already operated on by our tools and our brains. Having to stitch multiple lines together has its own mental cost.

Is a longer line always better? Definitely not. There are absolutely cases where an expression split over multiple lines and well-indented can make code much clearer. There are also cases of the inverse, where you have to take a single atom that was split across multiple lines and re-integrate it while reading. (The patch that Linus is referring to here has some of these.)

Line length should be driven by some combination of total character width and expression cohesion.

A single line can be overly complex even far below an 80 character limit, and a single line well above that limit can still be easy to read.

Prose, and therefore comments, have different reading patterns and therefore different motivations for line length. You want to limit comments and documentation to 80 characters? Fine. Code? That's valuing one metric over another, or simply hoping that strongly clamping one metric is going to also improve another.


I used to be an 80 char zealot, but since working in 100 and 120 char codebases, I've seen the light. Such line lengths make it MUCH easier to scan the code because we are able to scan that much without moving our eyes around a lot, and now it's not so cramped.

One param per line of a function call is bad for readability (unless there are more than four params).


100 is my sweet spot, unless i do Java/groovy and in this case i'd rather have 120. I still like 80 char comments/docstrings though.


This reminds me of the thing with musical albums.

Albums are the length that they are because of the technical constraints of the medium: you can only fit so many tracks on a vinyl disk.

However, this constraint has been adopted by musicians as a creative constraint - an album is a "feels right" amount of music for an artist to produce. You can tell a story in an album, fit a narrative arc in that much space.

But, y'know, sometimes you need a double album to tell the story properly. Some stories need more room.

I like 80 columns as a guideline. It feels about right, and fits my eyeballs. But I'm not going to let that stop me if I need more space. Some code demands more space.


Also you have to view code in multiple contexts, and sticking to 80 characters per line allows it to be viewable in more places. For example, I like to look at the diffs in github, and it is more flexible to read in a web browser if kept to 80 chars.


Nesting exists, and 80 is an arbitrary number.


Every number is arbitrary. 80 is often used simply because it's in the right ballpark and already used by many who agree on short lines.

People who like long lines have no main limit they agree on and if a line limit does exist, it often varies by project... making it even more arbitrary.


The main argument is that it’s not in the right ballpark though - which has no universal correct answer. However at this point 80 is not arbitrary - It’s primarily used as a default because of ancient history when 4k let alone 640k was enough for everybody. A time when the expense of expanding a meager terminal buffer 50% was hundreds of dollars - and this goes back to ancient history of 80 column punchcards.


Doesn't Linus also regularly argue that you shouldn't nest deeper than like 3 or 4 levels?


    class X {
        function Y {
            for (i < 20) {
                if (array[i] == 17) {
                    // just 64 characters left now


The patch rejected by Linus is a good example why you want "long" lines: the argument list of a function was broken into two lines, with the second line only containing a single function argument. While I can see that overly long lines can reduce readability, there is a lot of value in keeping function parameter definitions and functions calls in one line, if this line doesn't get overly long. We are not talking about written text like this comment here, but about program code, which has a very specific structure.

For comment blocks, I try to stick to roughly 80 chars per line, but that then is text which can be broken up easily. For program code, I try to stay below 100 chars but exceptions can be made, if breaking up lines would be detrimental to the structure. With very complex function calls though, one would switch to multi-line breaking, sometimes with one argument per line (and a possible comment).

This discussion makes me wonder why there are no editing modes with "smart" line wrap, as in being syntax-aware when wrapping long lines. A function call which exceeds the set limit, like the 80 chars, could be automatically displayed as a multi-line call, properly indented. So independent of your terminal size, you could always see nice-looking code optimized for that terminal, without imposing your perfect view size onto others.

Of course, this is easier to do with languages, which have a very strictly agreed upon code style. Go for example would make this easy, as all Go code usually gets autoformatted by gofmt. Lisp would be another language suitable for this, as the syntax makes this easy and overall code-formatting and indentation is pretty universally agreed upon.
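
A minimal sketch of what such a display-only wrap could look like, under loud assumptions: it handles a single call per line, splits only on top-level commas, and does not understand string literals or comments. `split_top_level` and `display_wrapped` are hypothetical names, not the API of any existing editor.

```python
def split_top_level(args: str) -> list[str]:
    """Split an argument string on commas that are not nested in brackets."""
    parts, depth, current = [], 0, []
    for ch in args:
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def display_wrapped(line: str, limit: int = 80, indent: int = 4) -> list[str]:
    """Return the lines to display; the stored source line is never modified."""
    if len(line) <= limit:
        return [line]
    open_pos = line.find("(")
    close_pos = line.rfind(")")
    if open_pos == -1 or close_pos < open_pos:
        return [line]  # nothing this sketch knows how to wrap
    args = split_top_level(line[open_pos + 1:close_pos])
    if not args:
        return [line]
    base = len(line) - len(line.lstrip())
    pad = " " * (base + indent)
    # One argument per displayed line; commas only between arguments.
    body = [f"{pad}{arg}," for arg in args[:-1]] + [f"{pad}{args[-1]}"]
    return [line[:open_pos + 1]] + body + [" " * base + line[close_pos:]]


long_line = ("result = compute_total(first_argument, "
             "second_argument(inner_a, inner_b), third_argument, fourth_argument)")
for shown in display_wrapped(long_line):
    print(shown)
```

The file on disk keeps the single long line, so tools like grep still see the whole statement; only the view changes with the window width.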


I would go further and advocate that the editor alone should do line wrapping as a viewing help. It would show up as smart and syntax aware wrapped code.

But the actual code should be unwrapped. This would also cut down on format discussions.


I write long lines from time to time, but only because I've messed up and made my subroutines into magical do-everything monstrosities that need a million arguments. It seems to me that (while there are exceptions) the real problem is the existence of lines that want to be long, rather than things like whether the language supports breaking lines inside a function call or whether your terminal has enough columns. It is unavoidable sometimes, but if we're trying to reduce the burden of figuring out what's being passed to a function, that would seem to indicate something has gone completely wrong on a deeper level.


In proper typesetting, "widow" words or lines are taken care of with fluid spacing and rewriting. It's a small science in how best to approach that, and you basically take the entire page/chapter into account to avoid them. Rewriting stuff to get rid of them is perfectly natural. I don't see why we don't care about readability to the same extent when programming.

When typesetting code (which is what we are doing when discussing line limits), natural questions to ask are:

1. Why have a line length limit?

2. If there are reasons to have it, what should it be?

3. Should it be in characters or actual width?

It seems that there's no (big) disagreement that we should have a length limit on lines in code. If you are disagreeing, none of the stuff below matters :)

When it comes to what it should be, a limit of 80 has proven to be sufficiently long not to impede development, and is likely just as good as any (you get benefits like multiple side-by-side-by-side windows on wide screens, yet you need to break lines more often; you need to factor your code more carefully to avoid too many indentation levels, yet you can't use sufficiently expressive symbol names....).

If you decide on any particular limit (100? 120? 400?), you'll still have cases where you need to break up long lines. The tools argument (grepping and such) is thus rendered moot, because you've got all the same problems, just less frequently (and that's probably scarier, because it'll be easier to miss something when refactoring).

Finally, the question of monospaced fonts is a curious one: unfortunately, I know of no code editor that works well with non-monospaced fonts, so I've never seriously considered not using them, but the only place where I truly care about fixed-spacing is indentation. And since that renders equally in non-monospaced fonts, I'd be willing to get rid of them too ;-)


In my experience proportional fonts look and work better than monospaced in all editors I use (including email clients, messengers, browsers, ebook readers, dictionaries, file managers). In general I would say anything you can say about preferences will be overridden by habit.


> The patch rejected by Linus is a good example why you want "long" lines: the argument list of a function was broken into two lines, with the second line only containing a single function argument.

I would chop the arguments, putting a single one on each line.


That would be a good alternative, especially if you use the space for documentation. I don't think there is one right way to do it, but certainly just wrapping arguments as if they were flowing text was the worst :)


Flowing text is almost never wrapped that way: I make a reference to handling of "widow" words above, and they're very undesired in typeset text.

Getting to know those simple gotchas about where to avoid line breaking can help with both prose and code writing.


Most of my professional life has been 80 columns, two spaces for indentation, and an auto formatter. I love it: on a 27" monitor I can have four 81-column terminals side by side, with room for a fifth column wide enough for commands. Being able to see many parts of the codebase side by side is incredibly useful.


I’m in the same boat. I’d rather have more views than longer lines. Currently have a file tree, three editor columns and one terminal column.

Keeping to a shorter line width makes me write shorter functions with less nesting. In my opinion that leads to higher quality code which is more readable (your mileage may vary, I write Erlang which is very terse and expressive which fits that style very well).


This needed to be said! What bothers me most about the 80 character mafia is not just the line-breaks, but also the insistence on short cryptic variable and function names. Good code is wide code.


Good code can be wide code. Doesn't have to be.

And code editors are decent at wrapping lines nowadays, if you like it.


That's quite curious. I'd probably be someone you'd consider a member of the "80 character mafia", but I am also pretty insistent on long, descriptive variable names.

Good code is readable code (so even if it's bad code, it's clear why it is so). What you lose with 80 character limit, you also win just as much.


adjustFooSOItFitsIntoTheSpaceAvailableIfNeeded

What gets me is long verbose names.

noun-verb. Short names. They are mnemonic not descriptive


A docstring can be used instead of making a many-words sentence-long name.

IDE can show it automatically when you focus on the name in the source code.


Sometimes a variable is for checking whether a value is kerfuffled by jingleberry so isKerfuffledByJingleberry can be a good name


That is, roughly, verb/noun

It also depends on how wide the scope is. "i" for an index variable is often a good choice. For the global count of eyes, "i" is a terrible choice.

So in a limited scope I would tend to: kerfuffled:bool


>noun-verb

fcntl


Actually I prefer verb noun. I mistyped.

goDisplay() rather than initialiseDisplayAndEnterDisplayModeIfEverythingIsOk


I just calculated the 90th quantile of line length over the TXR project, for all the C, Lisp, Lex and Yacc sources:

  1> (let ((q (quantile 0.9)))
       (each ((l (flow "git ls-files '*.c' '*.h' '*.tl' '*.y' '*.l'"
                   command-get-lines
                   open-files
                   get-lines)))
         [q (len l)])
       [q])
  63.9840684046716

The project follows two character indentation except in one inherited source file and its header.

The 99th percentile:

  2> (let ((q (quantile 0.99)))
  [ .. SNIP ]
       [q])
  78.9882857548558

Of course, that's skewed by the internal pressure to stay within 80 columns.

When we go to the 99.99th percentile, things take a bit of a leap:

  3> (let ((q (quantile 0.9999)))
  [ .. SNIP ]
       [q])
  104.790549931036

Still, that shows us that long lines are quite rare. Probably the column limit set by the 99th percentile is reasonable from this perspective.

Moreover, if we want to be quite generous, we don't have to add too many columns. 105 characters will only break 1 in 10,000 lines. Something can almost certainly be done about those. Or these outliers have something wrong, like being accidentally joined or something.

--

Let's tidy up the code, by eliminating the each iteration with mapdo:

  1> (let ((q (quantile 0.9999)))
       (flow "git ls-files '*.c' '*.h' '*.tl' '*.y' '*.l'"
         command-get-lines
         open-files
         get-lines
         (mapdo (opip len q)))
       [q])
  104.790549931036
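
For anyone who wants to repeat the measurement without TXR, here is a rough Python sketch of the same idea. It is not a port of the code above: it sorts the lengths and takes an exact nearest-rank quantile, so it returns whole numbers rather than fractional figures like 63.98. The function name and the `skip_blank` parameter are my own inventions for illustration.

```python
import math
from typing import Iterable


def line_length_percentile(lines: Iterable[str], q: float,
                           skip_blank: bool = False) -> int:
    """Nearest-rank q-quantile (0 < q <= 1) of line lengths.

    Hypothetical helper, not part of TXR or any library.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    # Measure each line without its trailing newline; optionally drop blanks.
    lengths = sorted(
        len(line.rstrip("\n"))
        for line in lines
        if not (skip_blank and line.strip() == "")
    )
    if not lengths:
        raise ValueError("no lines to measure")
    # Nearest rank: the smallest length such that at least q of all
    # lines are that short or shorter.
    rank = max(1, math.ceil(q * len(lengths)))
    return lengths[rank - 1]
```

To run it over a project you would feed it the lines of every tracked source file, for example by opening each path that `git ls-files` prints.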


As you note towards the middle, I think you're just measuring that code styles traditionally limit lines to 80 columns, not that 80 columns was really the natural 99th percentile of freeform code lines. Similarly, I don't think that shows 104+ character lines are really 1 in 10,000; it shows that 1 in 10,000 lines weren't yet refactored to fit into the 80-columns-per-line standard.

I'm all for >80-column-wide lines; I just don't think these stats really tell us anything about it, other than that 80-column lines have been the standard in the past.


I was going to ask if someone did an analysis of the sort! Kudos!

It'd be great to get a distribution graph as well, even if we can't know how long the lines would have been without a line length limit.

My gut feel before reading your comment was that:

* I like a short limit because the majority of the lines _are_ short

* A shorter line length limit allows me to have more windows side-by-side-by-side, without wasting whitespace on the right side

* I am willing to accept an occasional awkward line break to achieve the above

The question I had in my mind was how "occasional" was it, so thanks for answering that to a large extent. We could also do a deeper analysis to re-format all lines not to use line-breaks before running the numbers too.

Basically, if 64 was the natural limit for 90% of the lines (sure, some are this short because of the 80-character limit), then a 20% buffer should be good enough.


Does this account for blank lines?


No; we can introduce a (remqual "") element or similar to the (flow ...) pipeline after get-lines to achieve that. The numbers become: 66.06, 79.61, 105.89. Different, but not by much.


I basically agree. There is a point where lines get "too long" but it's more than 80 (or 72) characters on modern systems.

But if I'm working in a code base where all the lines are broken at <80 characters, it makes sense to stick with that rather than introduce inconsistent formatting.

Bottom line, your brain will adapt to either style without a lot of fuss, and consistency helps.


Just a note: books, magazines, slide decks (and more*) aren't getting any wider. As time goes on, and code and command output gets wider, it's getting a bit harder to write educational materials. I have slide decks and books that have "here's the left hand side of iostat -x" then "here's the right hand side." With the latest sysstat/iostat version, that's only getting worse (it may need three parts). That's fine if it's truly necessary.

When I develop new tools, I try to keep the default output within 80 chars if possible -- not because of terminals -- but because of all the other locations the output may appear.

* also for jira tickets, github threads, random blog posts, etc., for the output to stay on screen without fiddling with a horizontal scrollbar.


The Linux kernel doesn't have an autoformatter script that stops stupid conversations like this, where certain code files can look different than others?

One of the most productive policies a company can adopt is to make autoformatting mandatory and move formatting discussions from specific commits to discussions about the formatting rules themselves. If you want to make a change in formatting, you have to, at the very least, write a rule so that it is done automatically.


Didn't people in the news-printing business eons ago do studies on large lines and readability? Why else are there multiple columns in so many papers? Does anyone have a good reference for this? Personally, it makes sense to me, since I start to get lost reading long lines.

Plus the whole modern hardware comment is not well founded, since I like to put some monitors in vertical orientation.


But remember that there's a huge difference between reading compact text in a newspaper or book and reading source code. I don't remember ever getting lost reading long lines in source code, but it happens regularly to me when reading books.


One of my favorite parts about reading on a computer is that I can highlight text with the cursor as I read it. Just dragging down a line at a time while reading means I virtually never lose my place while reading even very wide format text on a computer.


I do a similar thing when reading paper books using a digital system much older than computers.


I have no reference for this, but I understand that a significant limitation in the news print industry would be what the Linotype machines could produce. They literally produced a Line-Of-type, cast in lead. You couldn't practically manage slugs of lead that are much longer than the typical newspaper column width, and so they ended up the width they are.

And prior to Linotype, there would have been practical limitations to the forme size that could be easily managed and quickly re-set to fix typos.

So I don't think the newspaper column width was the result of centuries of experimentation into readability. It Just Happened. (Same as the gauge in railways)


While arguably shorter lines were easier to set and maintain by hand, both hand typesetting and Linotype machines could certainly produce lines of text longer than a typical newspaper column -- and they did: look at any book without columnar text produced during the same period!

I think you're right about it being a way to deal with a physical limitation, but not of the physical type -- rather, of the physical paper. If you divide the page vertically into two or three columns (for magazines) or six to nine (for broadsheets), you get an extremely flexible grid to use for fitting stories around one another and around photos and advertisements. Text could be a single column wide or two or even three columns across; headlines can run across just a column or two or across the whole page; you can always find somewhere to put that story that continues from page 1A.


"Plus the whole modern hardware comment is not well founded, since I like..." That is exactly the essence of Linus's comment. If some people choose to do something, that does not mean the rest need to accommodate them. An 80-char limit is just ridiculous in 2021.


I don't think it's that cut and dried, but I find 120, which got popular with GitHub's viewable columns, to be pretty great.


Code is drastically different from prose, as code has a lot of repetition. For example:

  some_long_variable_name = 42
  new_value = description_function_name(some_long_variable_name)
  do_something_with(new_value)
All of the above has 5 unique tokens, several of them repeated. Unless you’re using lots of global variables, most of those tokens will be scoped entirely within that function, so you can also infer an awful lot of meaning just from the context.
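
The count of 5 seems to cover identifiers and literals only, leaving out `=` and the parentheses; that is my reading, not something the comment spells out. Since the snippet happens to be valid Python, a quick sketch with the standard `tokenize` module can check it. The helper name is hypothetical.

```python
import io
import tokenize
from collections import Counter

SNIPPET = """\
some_long_variable_name = 42
new_value = description_function_name(some_long_variable_name)
do_something_with(new_value)
"""


def count_names_and_numbers(source: str) -> Counter:
    """Count identifier and numeric-literal tokens, ignoring operators
    and punctuation. Hypothetical helper for illustration."""
    counts = Counter()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (tokenize.NAME, tokenize.NUMBER):
            counts[tok.string] += 1
    return counts


counts = count_names_and_numbers(SNIPPET)
unique_tokens = len(counts)            # distinct names and literals
total_tokens = sum(counts.values())    # every occurrence, repeats included
```

Under that reading, 7 token occurrences collapse to 5 distinct ones, which is the repetition the comment is pointing at.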


They were concerned about the ease of editors and typesetters making revisions. You can reformat shorter lines with less effort. It also affords more opportunity to insert ads.


I don't think you can extrapolate from prose, in which line breaks are meaningless, to code which is clearly very different.



The best world would be if every language had a decent formatter so everyone could view code as they wish and it was checked in at some standard width for a company. Or, a world where git didn't see a formatting change as a code change if there were no change in logic; where it sees the AST rather than the textual content. Then, I could use whatever width I want and you can use your preference, and neither of us have to care about the others settings.

Go further, and you could even have a file to map one or more variables to a different name, so that if someone prefers "avg_val" and you prefer "average_value", you could each see your preferred name, although that starts to get dangerous as you discuss the same code with different syntax but the same semantics.


Well, Linus needs the extra width for his 8-space-hard-tab indentations. At least in all the situations when he breaks his own rule that you shouldn't indent more than two or three times, but write functions instead.

Kernel coding style guide:

"Now, some people will claim that having 8-character indentations makes the code move too far to the right, and makes it hard to read on a 80-character terminal screen. The answer to that is that if you need more than 3 levels of indentation, you’re screwed anyway, and should fix your program."


I can 3-ways diff merge with three windows, side-by-side, each 120 characters wide. That is fine with me.

When I write my own code I try not to go past the 100th column on the line, though (and that is never 100 characters per line, with all the indentation going on).

But in case I stumble on code with long lines, I'm still fine.

One should also not forget that some languages some of us are forced to work with are incredibly verbose (a combination of culture, lack of type inference, the existence of every and any modifier under the sun, etc.).

    ECPrivateKeyParameters privKey = new ECPrivateKeyParameters(privateKey, Sign.CURVE);

That's 84 characters for you (not even counting leading indent characters) from BouncyCastle, the Java crypto library. Maybe you'll even want to slap a "private" modifier in front of that line.

And that's not a hardcore example: this is relatively soft.

Now thankfully, like Linus, I don't think 80-character-wide code makes any sense at all whatsoever in 2021 (not that I especially like verbose languages, btw).


I usually target around 120 characters as a right margin, allowing up to 130 before I start seriously considering a hard line break.

Anything more than 140 or so and it gets harder to scan (too sparse). Anything less than 100 or so and it gets harder to scan (too dense).

80 is just painful.


This is certainly subjective. I agree that 80 characters are often too constraining, but significantly more than 100 characters is already difficult to scan for me. I find a limit of 120 to be too much; around 100 is the sweet spot.


    I do get
    tired of reading code that isn
    't formatted in
    a
    clear way because the autoformatter is
    a slave to the column limit and
    especially in the case of 1. lists
    2. instructions. 3. Other things that
    should line up--is overzealous
    in breaking up lines and making the
    code as unreadable as possible.


Curious if 132 characters per line would work for most people.

I don't mind going past 80, but would still want some de-facto standard, and there's a fair amount of precedent for 132.


Based on experience and experimentation, I advocate 110 characters as a hard limit (i.e., would reject a change based on line length alone) and 90-95 as a good guideline.

The rationale for the 110 is that above that you cannot avoid wrapping in a side-by-side diff on a reasonable screen+font+layout. 90 is where you should be asking yourself "is this cleaner to split across multiple lines?" and still allowing some headroom to make minor changes without needing to reformat things.

There are of course some things where it is impossible to constrain things to shorter lines (e.g., markdown tables) but those are definitely the exception.


Agreed, personally I find 120 already too much.


What's the 132 precedent?


Many line printers back in the day did 132 columns. As a result the popular DEC VT-100 had a 132 column mode to allow proper viewing of the output of programs without having to waste paper. It wasn't that great and was sort of hard to read.

As part of VT-100 compatibility, many subsequent terminals also supported 132 column mode.


Others have mentioned printers and older terminals that supported 80 and 132 column modes. That bled into other things. Like, for example, Gnome's terminal app has a couple of 132 column settings as selectable presets. Also, the framebuffer/vga console in Linux (and probably other OSes) can be set into a 132 column mode. It's an official "VESA mode" also.


Back in the day some terminals had a choice of 80 or 132 column modes. Since obviously the window (screen) didn't change size, this was achieved with a narrower font.


Monospaced, right?


yes, but using fewer horizontal pixels.


You used to be able to run old terminals in either 80 or 132 column modes.


GNOME Terminal's four presets are: 80x24, 80x43, 132x24, 132x43

I always kind of figured "132" was some arbitrary programmer decision, never knew it dated to old terminals. At least I recognize 24 and 43 as the number of lines old text modes (on, eg, VGA) had.


Some old SVGA cards from the 1990s had a ~132x60 mode (as well as a bunch of others), that could be activated in DOS with the RHIDE editor that came with DJGPP (no doubt there were other ways), or with SVGATextMode in Linux.


This.

I know the DEC VT340 was capable of 132 columns, as was an Apple II with a Videx Ultraterm card and a suitable (hi-res) monitor.

There are probably even more examples, but those two are what came to mind first.


Ahhh, the good ol’ DEC VT-240 CRT.


I feel as if I was a christian and God himself had just materialised in front of me and said that he doesn't exist, after all.

An utterly confusing situation.


As I think I said in a previous entry, I use line breaks to separate "logical" instructions, whatever length they have. This is very common with chains (like the typical split.filter.map.filter.join). This also helps with comments, because you can put a comment above the corresponding line of code. If the comment is long and involves multiple explanations, maybe you need to try to split the code into multiple lines, for example a function call where each parameter is something worth explaining. Other times there is a lot of syntactically useless code that is required only to wrap the important part (looking at you, Java), and often I split it so that the important part is on a separate line, something like:

    // Get the ids from the data
    var ids = object.stream().map(object->
        object.getName() + object.getType()
    ).distinct().collect(Collectors.toList());

Word wrap is the one that automatically adapts the code to my screen, not the other way around.


    // Get the ids from the data
    var ids = object.stream()
                    .map(object ->
                         object.getName() + object.getType()
                    )
                    .distinct()
                    .collect(Collectors.toList());


It's surprising to me how software developers, supposedly very rational, cannot come to the obvious conclusion that the correct coding style is to have unlimited line length and using tabs for indent.

The reason that's the best choice is that it's the approach that provides logical data rather than a specific visual style, allowing people to just configure their editors to do word wrapping to whatever line length they prefer and set tab width to whatever value they prefer (including fractional values!), so it's strictly better than all alternatives.

And if someone doesn't like the editor word wrapping, they can always write an extension or patch for the editor to wrap in whatever syntax-aware way they want.

Also, while for some other bizarre reason it's traditional to program with monospace fonts, there is in fact no reason at all to use them instead of the more efficient proportional fonts, and if you use proportional fonts the concept of a length limit is meaningless.


> The reason that's the best choice is that it's the approach that provides logical data rather than a specific visual style, allowing people to just configure their editors to do word wrapping to whatever line length they prefer and set tab width to whatever value they prefer (including fractional values!), so it's strictly better than all alternatives.

But often you _do_ want to provide a specific visual style with your code. Vertical alignment is a powerful tool for communication and readability. Hence monospace and space-based indentation.

Also, word of advice: when a disagreement has been running for decades and involved thousands of people on both sides, casually describing your conclusion as "obvious" and belittling anyone who disagrees as "[not] very rational" signals superficiality, not intelligence.


I'd say your maximalist view is not maximal enough, by requiring tabs. Why not store code in the most minimal way possible (one-liner, no whitespace [assuming whitespace-insensitive languages]) and let viewers and editors wrap and indent it however they want.

That aside, I'd say the reason your vision is not achievable today is due to lack of great view/edit tools.


For a less-maximal outlook that is actually achievable with minimal editor support, check out elastic tabstops.

https://nickgravgaard.com/elastic-tabstops/

I've found that the last reason I can justify spaces has to do with hanging indents which are not aligned on tab boundaries. Elastic tabstops handles those with - you guessed it - tabs.


Thanks for the pointer. Elastic tabstops is a cool idea.


Also variable tabs! instead of being

    {
        {
            {
                {
                    {
                    }
                }
            }
        }
    }
they could be

    {
        {
           {
             {
              {
              }
             }
           }
        }
    }
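
One way to read the second picture is that each nesting level indents one column less than the previous one (4, then 3, then 2, then 1). That rule is my guess from the example, not something stated above. A minimal Python sketch under that assumption, with made-up function names:

```python
def shrinking_indent(depth: int, first_step: int = 4, min_step: int = 1) -> int:
    """Column offset for a nesting depth when every level indents one
    column less than the level before it (never less than min_step).
    Hypothetical rule inferred from the example."""
    offset = 0
    step = first_step
    for _ in range(depth):
        offset += step
        step = max(min_step, step - 1)
    return offset


def render_nested_braces(levels: int) -> str:
    """Draw `levels` nested brace pairs using the shrinking indent."""
    opening = [" " * shrinking_indent(d) + "{" for d in range(levels)]
    closing = [" " * shrinking_indent(d) + "}" for d in reversed(range(levels))]
    return "\n".join(opening + closing)
```

With the defaults, five levels land at offsets 0, 4, 7, 9 and 10, which matches the spacing in the example once the leading four spaces of the code block are set aside.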


"No reason to use monospace fonts at all" is a stretch. It can be nice to line things up easily.


Thoughts:

- prose and comments are greatly improved with shorter lines (I switched to reader mode to read the post)

- a logical code line often exceeds optimal prose length (70-80), due to indentation, syntax, language idioms etc

- code is more sparse and much less visually homogeneous than prose, which means code can have longer lines without causing "saccade return issues"

- it is technically easy and non-destructive to auto wrap in-editor for display output (long source -> shorter display)

- it's ambiguous and hard for an editor to "unwrap" (short source -> longer display)

- grepping source works better with less wrapping

- diff tool UIs seem to struggle with long lines, today

- some single identifiers are commonly longer than 80 - such as URLs

- comments and code don't have to have the same max length

Linus got it right in this case imo. In the future, I'd be curious about ubiquitous and standardized auto-wrapping in editors so that you can set whatever length you damn please without imposing it on everyone else.


I largely agree with this. However, I also tend to have short lines. For example, if a function takes more than a couple of arguments I put each on its own line. I like to keep a small number of "ideas" on each line, ideally one or two.

What I don't worry about is long function names or similar. I won't wrap a line because a function name is really long or similar, as I think that doesn't generally add more complexity to the line.

That being said I think soft-wrapping of code is very lacking. The best I have seen is matching the indent of the previous line. It would be neat if it preferred breaking on separate arguments, operators etc. I guess this is now moving to the realm of code formatters but now you are back to hard-wrapping which loses the info about the target screen size.


I follow the rule of 80/120: 120 chars wide as the hard limit, 80 chars wide as a soft limit that I should not surpass.


I liked this response in the thread related to blind contributors and braille displays.

https://lkml.org/lkml/2020/6/5/685

To summarize, this blind user at least doesn't consider 80-characters max width to be a supportive argument. They are more worried, instead, about matching up the terminal width to multiples of their display. And of course, to avoid forced guis.

Sometimes I think it's easy to make arguments about accessibility without actually consulting someone with an accessibility need.


You should also read the response to the response. This is not an issue.


I'm confused. I thought I did. Is the link above not correct?

Nicolas Pitre wrote: "Well, not really."

Is that the "response to the response" you're referring to? If so, that's what I meant as well. I thought Nicolas' response was on point with the discussion about accessibility.


It's amusing to me that the article body seems to be wrapped at 80 chars.

I think all of this stuff is an art not a science but I'd rather people didn't go much over 100 chars, it becomes really difficult to read and I'd argue in most cases there should probably be descriptive names given to parts of a long line in the form of variables/constants. Maybe this doesn't apply to kernel development but I love to name things really descriptively rather than giving things short names or building clever one liners.


Natural human language is taken to be the most legible at 9 - 12 words per line (in non-justified running text). Which is pretty much what you're seeing there.

Plus, you're most likely not going to check this conversation into git. And you're probably not going to pipe it into grep. :^)


Short lines (about 72 chars) make prose easier to read. But most code is tabulated, and is much less dense than prose.

I tried a few different thresholds over the years and I finally chose 100 chars. It is a good compromise, wide enough to not be obnoxious and enabling long names.

And still narrow enough to display two columns on a single wide screen when needed.


The asinine argument at the heart of that rant is that we should format our code so it's more greppable. Let's not consider making a better/semantic grep that can match code. Let's all instead insert strategic line breaks in our code so that the current grep sort of works on it.


Semantic line wrap serves the same purpose, and can be done at any line width when viewing.

Even the simplest improvement in line wrapping of maintaining the leading indent is pretty much enough. Once I started using Atom and then VS Code, which both do this, I pretty much stopped worrying about line length except when it was semantically confusing (like many arguments to a function where I use newlines to separate the top-level arguments from the expressions).
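
To make "maintaining the leading indent" concrete, here is a small Python sketch built on the standard `textwrap` module. It is only a display-time illustration and does not claim to be how Atom or VS Code implement it; `soft_wrap` is a hypothetical name.

```python
import textwrap
from typing import List


def soft_wrap(line: str, width: int) -> List[str]:
    """Wrap one source line for display, repeating its leading indent
    on every continuation row. Hypothetical helper; the file on disk
    is never modified."""
    content = line.lstrip(" \t")
    indent = line[: len(line) - len(content)]
    if not content.strip():
        return [line]  # blank or whitespace-only: nothing to wrap
    return textwrap.wrap(
        content,
        width=width,
        initial_indent=indent,
        subsequent_indent=indent,
        break_long_words=False,   # never cut an identifier in half
        break_on_hyphens=False,   # "-" is an operator here, not a hyphen
    )
```

It breaks only at whitespace, so it knows nothing about arguments or operators; a syntax-aware wrapper would need the language's tokenizer on top of this.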


Linus takes a seemingly opposite stance when discussing word-wrapping commit messages:

https://github.com/torvalds/linux/pull/17#issuecomment-56611...


Nothing but common sense, and I'm surprised I didn't read this first. I fully agree, and I'm surprised he would advocate that even for kernel development (where sometimes you do need to work with whatever the basic terminal format is, right after boot).


I feel that having longer lines means you can scan through a program more easily, by quickly scrolling through it vertically. Then once you're at the part you're looking for you can start scanning the text horizontally for more details.


The problem is if you have to scroll horizontally. Either you have to size your terminal or editor window very wide, wasting space that could be better used for other components (project tree, debugging views, other inspector views, documentation, …), or you can’t see the complete line at once. IMO the maximum line length shouldn’t be much more than twice(-ish) the median line length (not counting empty and brace-only lines).
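
The "twice the median" rule of thumb is easy to try on a real file. A rough Python sketch follows; what exactly counts as a "brace-only" line is my assumption (here: lines made up solely of brackets, semicolons, commas and whitespace), and the names are invented for illustration.

```python
import statistics
from typing import Iterable, List

# Assumption: characters that may make up a "brace-only" line.
BRACE_ONLY_CHARS = set("{}()[];, \t")


def counted_lengths(lines: Iterable[str]) -> List[int]:
    """Lengths of the lines that take part in the median."""
    lengths = []
    for line in lines:
        text = line.rstrip("\n")
        if all(ch in BRACE_ONLY_CHARS for ch in text):
            continue  # skips empty, whitespace-only and brace-only lines
        lengths.append(len(text))
    return lengths


def suggested_limit(lines: Iterable[str], factor: float = 2.0) -> int:
    """Line length limit as `factor` times the median counted length."""
    lengths = counted_lengths(lines)
    if not lengths:
        raise ValueError("no countable lines")
    return round(factor * statistics.median(lengths))
```

Feeding it `open(path).readlines()` for a few representative files would show whether the result lands near 80, 100 or 120 for a given codebase.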


Ironically, if the text on this page used the correct HTML elements instead of cramming everything into a monospaced code block—which results in browsers not wrapping long lines—this email would be readable on narrow displays like mobile phones.

I get the argument that you shouldn’t bring down the experience for everyone just to accommodate the most limited imaginable scenario (80x22) when it comes to breaking code lines, but it’s pretty easy to just replace <pre> with <p> in the context of the LKML listings.


I wrote some code (I'm an engineer) that ended up being the core logic of a HW driver that was linked to Linux. One of the actions by the SW dev team was to reformat it to 80 chars wide. That made it less readable by a long way; they even split up strings into multiple sections if they, heaven forbid, ran over the golden limit. They assured me it was a requirement for being upstreamed. Based on this, it does not seem to be, so who is pushing this 80-char standard?


Projects should have a formatter, and engineers should feel free to format files however they want when checked out, and then format them according to the project format when checking them in. All this about saccades and "code isn't a novel" makes me want to take up carpentry.


THANK GOD. This convention has always annoyed me especially in cases where I was told to shorten variable names (sacrificing readability) or where a line of code was clearly more legible even up to 120 characters. Finally, I can point out that there is a Linux wizard out there that disagrees.


I mean, should I write very long comments now packed into a single line? That's silly.


My personal guide boils down to this:

100 soft/best effort limit, 120 hard limit (with exceptions), 150 absolute limit (rarely used), 72 limit for docs/comments.

Auto-formatters usually set to 120.

More than 120 does come up, but very rarely - e.g. long URLs in documentation.


> for things like "grep" both in

I don't think using grep is an adequate development technique in 2020.

> "80-column terminal"

Sure, but code exceeding, I don't know, ~120-200 characters per line is still super annoying and much harder to read.

> checked, and my main one is 142x76 characters right now,

Which brings us back to the 120-200 character line length limit.

Given that most IDEs have some side panels, 150 characters should be the max instead of 200.

> 80x25 is really really limiting, and is simply NO LONGER RELEVANT

True, it's now 120-200x~50 but nice scrolling so it's more like x100+.

> But still - it's entirely reasonable to have variable names that are 10-15 characters and it makes the code more legible. Writing things out instead of using abbreviations etc.

I would go as far and say it's recommended, auto-completion is a thing and typing speed should never be your limiting factor (*).

> we do use wide tabs,

Is that 4 or 8 spaces?


grep rocks. Still.


I dunno. Don't we have yet to resolve where the curly braces belong, or emacs vs vi, or mac vs pc, or red sox vs yankees, or at least what materials to use for the staff bikeshed?


I also don't like excessive line breaks and prefer longer lines. No way 80 is a good idea. I'd say even 120 is too short.

So what is the recommended style for line length in the Linux kernel?


Getting used to a new style is like wearing new shoes. It takes a day or two to break in, before it feels natural.

Most arguments can be translated to "because I like it this way".


There’s nevertheless a bell curve. Hardly anyone would argue for a 1000-character line limit.


Definitely. I thought I mentioned the "within reasonable boundaries(80->150)" and evidently I have not.


https://reddit.com/r/linusrants for a few more Linus stories


As I get older my screens get wider. I’m currently using one that’s about 22” wide, which isn’t far off:

  (80*2+ui+browser)*legible-dpi
  == age/2


It's funny that the letter itself uses 70 char lines


Maybe add "(2020)" to the title?


I thought that for a change Linus wrote something remarkably noncontroversial. Turns out I was wrong.


Not the first time he's been loudly wrong. It really depends on the semantics of the code in question and not on the absolute line length anyways. I've seen 30 character lines that are hard to parse, I've seen 120 character lines that I don't really think about at all.


Now do cpython c source


I completely agree with this. I am considering an ultra-wide monitor (the only thing really missing from my home setup at this point), but even at 1920x1080 on all my monitors, 80 characters occupy at most 25-30% of my entire screen. The rest of the screen is just sitting there, being all "OK, black pixels all day long, gotcha boss". 130-140 characters is perfectly usable on any 1920x1080 screen, which is what most people have. My old 13-inch laptop fits those perfectly and still leaves plenty of unused space, even in the context of an IDE where you have your directory structure on your left, for instance. Not to mention something like vim, where a vertical split would still leave plenty of space lying around. I think most standards need to be updated and ditch the whole "Someone is using an 800x600 CRT monitor and they need to be able to work". I am sorry, no, I used to have such a monitor in the '90s as a child from Eastern Europe in an average-income family. In 2021, I cannot name a single developer with anything less than HD. People should seriously wake up and revise those standards: they are the equivalent of wearing the same underwear for 30+ years. Even side-by-side diffs work perfectly fine in those scenarios. 3-way diffs might be an issue on a 13-inch laptop, but honestly I don't know anyone who actually likes working with those (one notable exception, but I'd definitely disregard his opinions on anything, to be quite honest).

Side rant, I have a lot more issues with people writing functions that are 500 lines long and contain 10 individual sets of logic which would make sense to be fragmented in different functions. As a matter of fact I had a fight over this at my now old job. Ideally the body of a function should fit in one screen: ideally 60-80 lines. In fact there is a study on the subject described in "The Psychology of Computer Programming" [1]. If it's something larger, in 90% of all cases you can segment your logic in a way which allows one component to be responsible for one specific thing and take that out in a separate function. With good naming, this makes the code infinitely easier to read 6 months down the line when you have to figure out what someone else or even you have done. The argument I got was "Well what if I have an ultra-wide monitor set up vertically, then it's not a big deal?". Fffs, are you stupid??? What if I start coding on my 4k projector and I can fit 1000 lines in a single screen? Fetch a vector from one place, iterate over it, populate a hashmap, compute some standard deviation, save a backup on disk, check the bounds of the values, store them somewhere for future reference and call an async function to a message queue, check if a set of conditions are met and have 8 different fragments of code handling those based on the conditions that were matched. Function name: fetch_history_data. There's a lot more than fetching history data in there now, isn't there? End of rant, I think I had to get it out of my system.

[1] https://www.goodreads.com/book/show/1660754.The_Psychology_o...


> I am considering an ultra-wide monitor(the only thing really missing from my home setup at this point) but even at a 1920x1080 on all my monitors, 80 characters occupy at most 25-30% of my entire screen

It is common for people to use a split-screen setup (two files / editor buffers in parallel). On a common 1920x1080 screen with a 10px-wide font, you have 2x96, so something like 80-90 columns is ideal.


Monospace at font size 9 is 272 characters per line.


Yes, but that would be a 7px-wide font. Even in the '80s the default VGA font was 9px wide, and that was considered an important improvement over EGA's 8px-wide font. And that was on monitors with lower DPI than is common at 1920x1080, so I would assume that 10px wide is the minimal ergonomic size for many people (on standard 96 DPI).
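
The arithmetic in this sub-thread is simple enough to write down. A tiny Python sketch, assuming a monospace font where every glyph is a whole number of pixels wide and ignoring window borders, gutters and side panels; the function names are hypothetical.

```python
def columns_per_pane(screen_px: int, glyph_px: int, panes: int = 1) -> int:
    """Whole character columns available in each of `panes` equal-width
    side-by-side panes. Ignores borders, gutters and scrollbars."""
    return screen_px // (glyph_px * panes)


def glyph_width_px(screen_px: int, columns: int) -> float:
    """Glyph width implied by fitting `columns` characters across the screen."""
    return screen_px / columns


split_10px = columns_per_pane(1920, 10, panes=2)   # two panes, 10px glyphs
implied_px = glyph_width_px(1920, 272)             # 272 columns across 1920px
```

So the 2x96 figure and the "that would be 7px" reply are both consistent with a 1920-pixel-wide screen; the disagreement is only about which glyph width is comfortable.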


Amen

