Design Better Data Tables (medium.com/mission-log)
124 points by sebg on March 6, 2017 | 29 comments


I disagree strongly with only mentioning the units in the first row.

If all units in a column are the same, mention that unit in the column header. Otherwise, you need to mention it everywhere. Only putting units in the first row suggests that the first row is different. It is basically lying in all the other rows, presuming the reader catches the error and understands what is meant.

I also disagree slightly on color. Restricted use of color makes sense, for example red for negative values, or green for highlighted values that you want to stand out.

Heck, if you have very geometric 2-D data, perhaps there is even a point to be made to color the background according to the value so you can 'see' the structure. This is pretty niche though.


> I also disagree slightly on color. Restricted use of color makes sense, for example red for negative values, or green for highlighted values that you want to stand out.

I would argue that while you can use colour to help, your documents should be:

* Perfectly readable when monochromatic

* Fully understandable without the colour

I would be concerned if you need to add colour for your data to be understood, and if you add it then it should be done with the understanding that not all of your readers will be able to see the same differences as you can (and you might be causing issues by altering things like contrast).

My only other main comment on this is similar: I want the communicated information to not rely on anything visual in tables. Bolding some results to indicate that they're significant is fine if that information is also available in another way, as I hate converting a format and losing information.


On color, I like to slightly highlight the row (or column, depending on the data) I'm looking at by slightly changing the contrast.


author here! i think putting the unit in the header also works just fine. it's the same "ink" either way, so it comes down to particular usage.

color-wise, one of the strongest reasons to not use color is the exact situation you mentioned; in some countries (namely, china), green values are negative and red are positive.


Some before/after shots would have been nice to get the point across.

The biggest abuse of tables I see (and have often been the implementor of) is showing too much information for too many workflows that are only slightly related. I'm sure most of us here have seen what happens when we add "just one more column" over and over again until it becomes that giant abomination with dynamic sorting (some even do it client side) and a million filters that force people to click 20 times just to do their job.

I think it causes a lot of maintenance issues as well. It's much easier to work with many smaller, more task-focused displays than it is to work with the big "do everything" ones.


This works really well to illustrate most of these points: http://i.imgur.com/ZY8dKpA.gif


non-gif version: https://speakerdeck.com/cherdarchuk/clear-off-the-table

And the rest of these: https://speakerdeck.com/cherdarchuk (Pie Chart is my personal favorite :))


I found clients love having their questions answered in the form of 'Y' flags in leading columns before the data, such as 'Products shipped more than 250,000', 'Products still in stock', 'Products in transit'. It allows them to quickly filter the data in a way more tailored to what they're interested in.


The article states "One unsubstantiated opinion I have about rules is that zebra striping is bad. Really, really bad. Take it or leave it." - can someone elaborate?

For me, when reading something with data tables, a common use case is to look up a particular value in the table, and in the case of large tables (such as the baseball statistics example given right above that quote, https://cdn-images-1.medium.com/max/2000/1*71B5i6rZMMsryN0pD...) it's very hard to reliably do so, and zebra striping or some other solution is necessary to ensure that you're reading the right value for this team, not the one above or below that.

I would consider a guideline that makes tables more beautiful but less readable (as measured by purely functional metrics such as speed and mistake rate when reading them) as deeply flawed, putting the cart before the horse, so to speak.


Yeah, I have no idea why the strong stance. It seems to be so strongly rooted in a minimalist perspective. If you want to not have zebra striping on by default, I may be okay with that if you have it on during hover. In this paradigm I also want it in both directions, like a giant +.

Another thing not discussed is the problem with big tables. If you have to scroll at all, you lose the headers and the "label" in column A. I like it when only the "data" part scrolls and these stay fixed and always visible. It would also be interesting to see a replication of the header as a footer and an additional column Z that duplicates column A. This helps you not have to move your eyes back and forth as you look at the data and scroll around.


Something that strikes me here: Coming from a European standpoint, using commas as a delimiter for 3-digit groups seems like a very silly decision. In the German-speaking regions, the apostrophe is generally used for this, which reduces the probability that someone could mistake it for a decimal mark.

On the other hand, the Germans seem to use commas as decimal marks, which is even sillier. Both of those habits combined can cause much confusion if you're not sure where the data comes from...


>On the other hand, the Germans seem to use commas as decimal marks, which is even sillier.

Yes, it's a bit silly that there are multiple decimal marks used, but why is the comma any sillier than the dot? Both are very widely used https://en.wikipedia.org/wiki/Decimal_mark#Hindu.E2.80.93Ara...


From a traditional (handwriting/typewriter) perspective, every symbol is just as valid as any other, of course.

What I find to be a particularly bad decision is still using the comma in the "modern" context of computers. It's basically just doing things differently for difference's sake.

The UK/US influence wrt. the notation of numbers in the computer age has been huge for historical reasons, with the effect that e.g. I know of no single programming language that uses a comma for this purpose.

Clinging to the comma (for spreadsheets, for example) adds no substantial benefit, but generates much confusion if you find yourself in an unexpected regional setting and your numpad starts to produce commas or your Excel sheet stops returning sensible numbers.

So, at least from my naive perspective, using commas and dots as two different digit separators is a bit silly because of their visual similarity, while continuing to use the comma in specific regions when the world of computing has basically (de facto) standardized on other formats is silliness that is actively maintained. One could essentially call it "notational cruft".

It reminds me of the US not adopting the metric system - but at least, I can see a high mental barrier there (if you "think" in °F, °C will always be weird).

Edit: Of course, no such post would be complete without a reference to the NASA Mars Climate Orbiter [1], where exactly this effect (physics calculations are usually done in SI units) caused rather costly practical issues.

[1]: https://en.m.wikipedia.org/wiki/Mars_Climate_Orbiter


> In the german speaking regions, the apostrophe is generally used for this

The apostrophe (') as delimiter for 3-digit groups is partly used in Switzerland. In Germany we use periods (.) as delimiter.

>On the other hand, the Germans seem to use commas as decimal marks, which is even sillier.

For me, decimal marks are more important than digit group delimiters. And a period (.) is smaller than a comma (,). Thus we use a comma for the more important feature of numbers. This is how I explained it to myself in school. Of course, as a programmer, I'm pretty much used to periods as decimal marks now.

Edit: Add two sentences.


Spaces are becoming more prevalent because they cause the least confusion (I also think they look nicer/cleaner).


In Germany we don't use the apostrophe as a digit group separator. Instead, we use a small space, like so:

1 234 567

Edit: others suggest that a dot is used as group separator and it is sometimes seen. I believe that it is a more recent development, adding _even more_ confusion, if that is possible. DIN 1333 actively suggests _not_ to use a dot as group separator (WP: https://de.wikipedia.org/wiki/Schreibweise_von_Zahlen#Deutsc...)


Yes. Spaces or periods.

(Periods like in this sentence from a front page article on ZEIT online: "Dafür erhält GM weitere 900 Millionen Euro, so dass sich ein Gesamtvolumen von 2,2 Milliarden Euro ergibt. Betroffen sind rund 40.000 Angestellte in zwölf Fabriken.")

Edit:

>I believe that it is a more recent development, adding _even more_ confusion, if that is possible.

Interesting. Using spaces in handwriting is more confusing though. We never used spaces on blackboards in school (at least in the 90s).


And this is why CSV is one of the most broken file formats of all time.

"Comma-separated values" for spreadsheets doesn't work when comma is the decimal separator. Localized versions of Excel decided to store numbers in the local format, and just used some other delimiter for their version of "CSV" (NCSV?). So what happens is that you open an American .csv file in a local Excel, and you get an unexplained mess. The same happens going the other way.


This isn’t a problem if you surround your field data with quote marks (this is also required if the field data contains newlines).

https://en.wikipedia.org/wiki/Comma-separated_values#Basic_r...
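A minimal sketch of the quoting approach using Python's standard `csv` module. The rows are made-up sample data with decimal commas already formatted as strings; `QUOTE_ALL` is just one of the module's quoting modes, chosen here to make the effect obvious.

```python
import csv
import io

# Hypothetical rows: numbers pre-formatted with a decimal comma.
rows = [["team", "average"], ["Cubs", "0,256"], ["Mets", "0,246"]]

# Write every field wrapped in quote marks.
buf = io.StringIO()
writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
writer.writerows(rows)
text = buf.getvalue()

# Reading back recovers the fields despite the embedded commas.
parsed = list(csv.reader(io.StringIO(text)))
print(text)
```

Note that quoting only keeps the fields intact. Whoever reads the file still has to know that "0,256" uses a decimal comma.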


You can select the separator when opening the file in LibreOffice Calc (don't have Excel on this machine but I know it's possible as well). Setting it to ";" fixes this for all European data sets I have worked with that use "," where "." would be used in the U.S. You can also specify the separator in pretty much every other tool that imports CSV (for example in R).
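The same idea outside a spreadsheet, sketched with Python's `csv` module. The sample data and the helper `parse_decimal_comma` are hypothetical, and the helper assumes the fields contain no digit group separators.

```python
import csv
import io

# Hypothetical European-style export: ";" between fields, "," as decimal mark.
data = "item;price\nwidget;1,50\ngadget;12,25\n"

def parse_decimal_comma(s: str) -> float:
    # Assumes no grouping separators appear in the field.
    return float(s.replace(",", "."))

# Tell the reader which separator the file uses.
reader = csv.DictReader(io.StringIO(data), delimiter=";")
prices = {row["item"]: parse_decimal_comma(row["price"]) for row in reader}
print(prices)
```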


Pipe | is even better.


>Something that strikes me here: Coming from a european standpoint, using commas as a delimiter for 3-digit groups seems like a very silly decision. (...) On the other hand, the Germans seem to use commas as decimal marks, which is even sillier.

Huh? From what I've seen across the world, it's either "commas for decimal marks and periods for 3-digit group delimiters", or the inverse ("periods for decimal marks and commas for 3-digit group delimiters").

I can understand finding one or the other "silly" (depending on what you are accustomed to), but you seem to find both of these combinations silly.

Which is the third possibility that you consider non-silly?

(Note: reading again, you seem to imply the apostrophe for groups and period for decimal mark is the best option. It might be, but is it used anywhere outside of Germany?)


> Which is the third possibility that you consider non-silly?

Grouping separators include space (that's the SI style, using either comma or period as decimal separator), apostrophe and "˙" (DOT ABOVE). A few locales also use "·" (MIDDLE DOT) as decimal separators.

Many modern languages allow interspersing "_" in numbers which replaces spaces for grouping.
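For instance, Python (3.6 and later) accepts underscores in numeric literals and can also emit them when formatting. A small sketch with a made-up value:

```python
# Underscores in the literal are purely visual; the value is unchanged.
population = 1_234_567

# The format mini-language can group with "_" or ",".
with_underscores = f"{population:_}"
with_commas = f"{population:,}"

# Space grouping is not built in; swapping the comma is one simple workaround.
with_spaces = f"{population:,}".replace(",", " ")

print(with_underscores, with_commas, with_spaces)
```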


My basic gist is that

a) Using dot and comma provokes mistakes, thus I call it "silly".

b) Deviating from the dot as decimal mark in the context of computers is a bad idea™, because it violates a de facto standard in computing and introduces a myriad of regional conflicts without any benefit whatsoever.


Grouping by decimal point or apostrophe would be just as confusing in the UK (where grouping by comma is the norm).

In an ideal world, numbers would be formatted based on the user’s locale.
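A rough sketch of what that could look like, in Python. The `STYLES` table and the `format_number` helper are hypothetical and only cover the conventions mentioned in this thread; the keys are illustrative labels, not real locale identifiers. For real use, something backed by a proper locale database (for example Python's standard `locale` module) would be the better choice.

```python
# Hypothetical style table: (group separator, decimal mark).
STYLES = {
    "en": (",", "."),        # commas for groups, period for decimals
    "de": (".", ","),        # periods for groups, comma for decimals
    "ch": ("'", "."),        # apostrophe grouping
    "si": ("\u202f", ","),   # narrow no-break space grouping
}

def format_number(value: float, style: str, places: int = 2) -> str:
    group, decimal = STYLES[style]
    # Format with Python's built-in "," grouping and "." decimal mark,
    # then split at the decimal mark and swap in the chosen separators.
    whole, _, frac = f"{value:,.{places}f}".partition(".")
    whole = whole.replace(",", group)
    return whole + decimal + frac if frac else whole

for style in STYLES:
    print(style, format_number(1234567.891, style))
```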


Reminds me of Stephen Few's classic book, "Show Me the Numbers". Excellent book if you have to report information to others that must be quickly understood.


Interesting point about right-aligning numeric data. I've been making data tables for a living for a decade, and never actually considered the reasoning behind it.


That seems very odd to me. I would expect most people should have learned the impact of this from the first time learning addition. I guess this is a good example of why you can never assume to know how someone else thinks about information.

  |  5
  |+ 10
  |----
  |  15

  |   5
  |+ 10
  |----
  |  15


As beart says, it helps with arithmetic by lining up the 1s, 10s, 100s. Also, it makes it easier to compare which number is larger/smaller--very useful when dealing with currencies.
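A quick way to see the difference in a monospaced font, sketched in Python with made-up amounts. The `<` and `>` alignment flags are part of the standard format mini-language.

```python
amounts = [5, 10, 1250, 37]  # hypothetical values

# Pad every number to the width of the widest formatted one.
width = max(len(f"{a:,}") for a in amounts)

left = [f"{a:<{width},}" for a in amounts]   # left-aligned
right = [f"{a:>{width},}" for a in amounts]  # right-aligned: 1s, 10s, 100s line up

for l, r in zip(left, right):
    print(f"|{l}|   |{r}|")
```

In the right-aligned column the longest number is visibly the largest, which is the comparison benefit mentioned above.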



