I was wondering if the 1-based arrays (and option to change index base) would factor into this.
> OffsetArrays in particular proved to be a strong source of correctness bugs. The package provides an array type that leverages Julia’s flexible custom indices feature to create arrays whose indices don’t have to start at zero or one.
Array indexing is such a core thing and I don't understand why anything mathematical or scientific would start with 1.
> Because starting with 0 is neither math nor array indexing in general.
It very, very much is. Polynomials all start at a zero "index", as does just about every expansion I can think of (Fourier, Bessel, Legendre, Chebyshev, Spherical Harmonic, etc.) Combinatorics, too, makes lots of use of zero indices and zero-sized sets. As for arrays, I'll leave it to Dijkstra[1] to explain why zero indexing is most natural. Zero indexing overwhelmingly makes the most sense in both math and computers because indexing is a different operation than counting.
Notice how you're ad hominem-ing the structure of the argument and not the argument itself? I do not at all see how putting quotes around that word invalidates the argument. I did so because mathematical literature doesn't refer to it as an index (rather as a degree as you mentioned), but it very much does index each monomial. There are an infinite number of index sets for each polynomial -- just as i can index the i'th monomial, so can (i - 7), or (i - 239842), or (i - pi) -- but one of them is obviously the most natural (pun intended).
>I do not at all see how putting quotes around that word invalidates the argument
When the argument is:
[0] very, very much is [natural for indexes]
and as an example for that points to something that's not an index -- and the person making the argument knows it is not an index, so they have to put index in quotes:
Polynomials all start at a zero "index"
...then pointing this out does invalidate the argument. It might not prove that the opposite is true, but it sure does invalidate the argument.
Notice also how there's no ad hominem in my response (this or the previous one) as you claim. I argue against the case and the choice of example, not against who wrote it.
This is an interesting point, though, respectfully, I do still think it's ad hominem. Internet arguments being what they are I don't much care, but I offer my reasoning here to better understand your point. OP did not engage with any of the points made, merely offering another term (without any sort of elaboration or definition), and said
> Notice how you had to put index in quotes.
as the thrust of the argument. In saying that, they imply that I, the arguer, 'A' in Bond's article, don't actually know what an index is (so how could I have a cogent argument about 'correct' indexing?). As this is the only argument of merit, it seems as though OP is actually trying to counter the point by suggesting (attacking) something of the arguer (myself).
Now, it may be that this ad hominem is justified -- if I truly don't know what an index is then yes, I probably should not be making claims about them -- but it's still an ad hominem (and, possibly, poor form).
Of course, this is ascribing a lot to 25 words of text with little other context. I would be interested to understand if you see things differently/think I have grossly erred in my analysis.
>OP did not engage with any of the points made, merely offering another term (without any sort of elaboration or definition), and said Notice how you had to put index in quotes.
Yes. That's addressing the point you made.
Ad hominem would be: "You're a bad person/you have this or that flaw/etc (unrelated personal stuff)".
This is: "You put the index in quotes, because even you know that this is not an index. And in any case, this is not considered an index in math, it's a degree, which is a different thing".
I also didn't "merely offer another term", as if I made up some term on my own, or just offered one of several equal alternatives. Instead, I gave the correct math term for the thing described.
>In saying that, they imply that I, the arguer, 'A' in Bond's article, don't actually know what an index is (so how could I have a cogent argument about 'correct' indexing?).
It would rather imply the opposite: that you know what an index is, and you know that the thing you applied it to, is not an index (which is why it was put in scare quotes).
> In saying that, they imply that I, the arguer, 'A' in Bond's article, don't actually know what an index is (so how could I have a cogent argument about 'correct' indexing?).
That's not what they are saying. They are saying you know what an index is so well that you correctly put quotes around your usage of the term, because you understood it's not in fact a technically correct usage.
Now calling an argument poor form... that's closer to ad hominem.
For some of these polynomials, such as Fourier polynomials, it is natural to think about negative subscripts from a pure mathematical perspective. While these can be mapped onto non-negative integers, it is often intuitive to use the "negative subscripts" as indexes, necessitating methods such as `fftshift`. For many of these polynomials the concept of where they "start" is arbitrary.
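To make the mapping between "negative subscripts" and non-negative array positions concrete, here is a minimal pure-Python sketch. The helper names `signed_frequency` and `fftshift_order` are made up for illustration; the sketch assumes the usual DFT convention where the upper half of the bins stands for negative frequencies, and it only mimics the reordering that library routines like `fftshift` perform.

```python
def signed_frequency(k, n):
    """Map a 0-based DFT bin k (0 <= k < n) to its signed frequency index."""
    half = (n + 1) // 2          # bins below `half` are the non-negative frequencies
    return k if k < half else k - n


def fftshift_order(n):
    """Bin numbers reordered so that the signed frequencies come out ascending."""
    half = (n + 1) // 2
    return list(range(half, n)) + list(range(0, half))


n = 8
bins = fftshift_order(n)                           # storage positions, reordered
freqs = [signed_frequency(k, n) for k in bins]     # the "natural" subscripts
print(bins)
print(freqs)
```

The storage index and the mathematical subscript are two different things here, which is why neither a 0 nor a 1 starting point falls out of the math on its own.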
Math (usually) uses 1-based indexes because those parts of math started before the concept of zero as a number, and then the convention persisted, even down to Matlab.
There are many similar path-dependent conventions in human culture. E.g. percentages originated before the concept of decimal fractions, base-sixty time units come from ancient Mesopotamia, and conventions about multi-dimensional array memory layout are based on the convention for drawing matrices on paper.
Most common mathematical sequences and series work better (more naturally/clearly) when zero indexing is used instead, and off-by-1 errors are a problem in mathematics just like computing (but less of a problem, because notation errors get silently corrected in readers’ heads, and don’t actually have to be interpreted strictly).
Math traditionally has had some bad notation from a formal point of view, because humans are good at coping with bad notations (or going back and forth between variants), unlike machines and formal systems. Computer science being a (more) formal science (vs math which is overwhelmingly not done formally), it has criticized some traditional math notations which are ad hoc and not nicely formalizable (and put forward variants that are actually better behaved in terms of mathematical structures).
For indices: indices are about referencing elements of finite ordered sets, say of size N. Hence the 'abstract' indexing set for N elements is the ordinal N. The most canonical way to represent it is to take the length-N prefix of the natural numbers (e.g. 0-based indexing, von Neumann ordinals), which happens to have all sorts of additional structure (e.g. mod-N arithmetic). This is also consistent with the offset view (the i-th element is at offset i). The fact that people tend to start ordinal numbers at 1 doesn't change the fact that mathematicians working with ordinal numbers take them to start at 0, for the same reason we start naturals at 0.
See also: notation for higher derivatives https://arxiv.org/abs/1801.09553; a bit further but in the same vein: notations for free variables in programs as de-bruijn indices (or some variant thereof) (it's further because it's practical for doing proofs, but not for writing concrete terms). There are probably other instances.
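As a small illustration of the mod-N structure mentioned above, here is a hedged sketch of "step to the next slot, wrapping around" under both conventions. The function names are invented for this example and are not taken from any library.

```python
def next_zero_based(i, n):
    """Successor of index i in 0..n-1, wrapping around."""
    return (i + 1) % n


def next_one_based(i, n):
    """Successor of index i in 1..n, wrapping around."""
    return i % n + 1


def prev_zero_based(i, n):
    """Predecessor of index i in 0..n-1, wrapping around."""
    return (i - 1) % n


def prev_one_based(i, n):
    """Predecessor of index i in 1..n: shift to 0-based, step, shift back."""
    return (i - 2) % n + 1


n = 4
print([next_zero_based(i, n) for i in range(n)])
print([next_one_based(i, n) for i in range(1, n + 1)])
```

With the 0-based index set the wrap-around is plain arithmetic mod N; with the 1-based set the same operation needs a shift into and out of 0..N-1.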
It makes iteration less error-prone too when the index of the last element is equal to the length of the array. In C it’s pretty easy to iterate past the end of an array if you use <= by mistake in a for loop, or forget a “length - 1” somewhere.
I'll start by saying that I greatly prefer 0-based, and have used both 0- and 1-based indexing, but the choice is largely arbitrary.
0 makes sense as the '0-th offset' when thinking from a pointer perspective, but I often find when teaching, that 1-based comes more naturally for many students (the 'first' item).
You mention mathematical or scientific work...but I often/mainly see enumerations (such as weights x_1, x_2, ... x_n or SUM 1 to N) start with 1, so for these 1-based can be a more natural/direct translation of mathematical notation to code.
My experience is that 0-based offsets (and use of < or even != for upper bounds) mean that I should almost never have to write something like idx - 1 or idx + 1.
I came to 0-based offsets later in my career, having started with Matlab. So I have some real experience with 1-based offsets. Experience that was 'untainted' by being used to a different option. I much prefer 0-based.
Especially because I now sort of have a linter rule in my head: 'if I am writing i - 1 then I am making a mistake or doing something the wrong way'. Which has been quite successful.
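One place where the "no idx - 1" rule of thumb shows up is splitting a range into pieces. The sketch below is only an illustration, with a made-up helper `chunk_bounds`; it assumes 0-based offsets and half-open `[start, stop)` bounds, the convention Python's own `range` and slicing use.

```python
def chunk_bounds(n, size):
    """Half-open [start, stop) bounds that cover range(n) in pieces of `size`."""
    return [(start, min(start + size, n)) for start in range(0, n, size)]


data = list(range(100, 110))            # ten items
bounds = chunk_bounds(len(data), 4)
chunks = [data[start:stop] for start, stop in bounds]

print(bounds)
print([stop - start for start, stop in bounds])   # chunk lengths, no +1 needed
```

Each chunk's `stop` is the next chunk's `start`, and a chunk's length is `stop - start`, so no `+ 1` or `- 1` appears anywhere.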
Or Fourier coefficients :-) Or pretty much anything where the index/subscript is related to the math itself.
Math textbooks and papers tend to use 1-based subscripts when it doesn't matter. It's hard to come up with examples where starting at 1 facilitates the actual math.
Just for consistency: a_n is the coefficient of x^n, so the constant term ends up being a_0.
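A quick sketch of that consistency, assuming a polynomial is stored as a plain list where `coeffs[n]` is the coefficient of x^n (the function names are made up for illustration):

```python
def evaluate(coeffs, x):
    """Evaluate sum(coeffs[n] * x**n) with Horner's rule."""
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result


def multiply(p, q):
    """Multiply two polynomials; degrees add, so list indices add."""
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


p = [5, 0, 3]                  # 5 + 0*x + 3*x^2, constant term at index 0
print(evaluate(p, 2))
print(multiply([1, 1], [1, 1]))   # (1 + x)^2
```

Because the index is the degree, the product rule x^i * x^j = x^(i+j) becomes `out[i + j]` with no offset correction.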
Based on my experience, numbering starts from one (like (x_1, x_2, x_3) as a point of R^3) and offsets start from zero, e.g., when dealing with discrete time, t_0 is the first.
As other posters noted, in mathematics both 0-based and 1-based indexing are used.
When dealing with matrices and vectors (including data tables and data columns), there is a strong preference for 1-based indexing: first row, first column, first entry, etc. Most matrix- and vector-based algorithms in the literature use 1-based indexing. Programming these in a language with 0-based indexing is a mess, and a common source of errors.
When dealing with sequences, especially recursively defined ones, there is usually an initial value (indexed with 0) and then the n-th value is obtained by n applications of the recursive step, so 0 based indexing makes more sense, but in literature there is no fixed convention, and you can find examples with 0 based and with 1-based indexing. Another example of 0 based indexing in math are polynomials (and in extension, power series) where the index is the degree of the term, or in general any functional series where the 0-th term is the constant term.
> Array indexing is such a core thing and I don't understand why anything mathematical or scientific would start with 1.
Because that's how maths work? Literally everywhere in maths you count from 1, except in software engineering. That's why. I hope that clarified your confusion.
My hot take is that 1-based indexing is often a mistake in math too. It's also not universal, even within math. And linear algebra doesn't need 1-based indexing either, and some operations are even more easily expressed with 0-based indexing.
What index to start with only strongly matters when the indexes have semantics. Otherwise you should just treat it as an opaque index, i.e. eachindex(), keys(), etc. In math when there are semantics, the indices usually include 0. When not (vector components, matrix indices, etc), they usually (but not uniformly) don't.
The one nice side-benefit of Julia's mistake in adopting 1-based indexing is that it provided an extra impetus to build machinery to handle arbitrary indexing, though too much code still doesn't work correctly, and code still gets written to only handle 1-based arrays.
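To illustrate the difference between assuming a base and treating indices as opaque, here is a toy sketch. `OffsetList` is a hypothetical class written for this example, not a real library type, and its `indices()` method only loosely plays the role that `eachindex()` plays in Julia.

```python
class OffsetList:
    """Hypothetical list whose first valid index is `first`."""

    def __init__(self, items, first):
        self._items = list(items)
        self.first = first

    def indices(self):
        """The valid indices, whatever the base happens to be."""
        return range(self.first, self.first + len(self._items))

    def __len__(self):
        return len(self._items)

    def __getitem__(self, i):
        if i not in self.indices():
            raise IndexError(f"index {i} outside {self.indices()}")
        return self._items[i - self.first]


def total_assuming_one_based(arr):
    """Hard-codes 1..len(arr); only correct when arr.first == 1."""
    return sum(arr[i] for i in range(1, len(arr) + 1))


def total_generic(arr):
    """Asks the container for its indices instead of assuming a base."""
    return sum(arr[i] for i in arr.indices())


ones = OffsetList([10, 20, 30], first=1)
shifted = OffsetList([10, 20, 30], first=-1)
print(total_generic(ones), total_generic(shifted))
```

In this toy version the base-assuming function fails loudly with an `IndexError` on the shifted container because every access is bounds-checked. Where bounds checks are skipped for speed, the same mistake can instead produce a silently wrong result.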
> What index to start with only strongly matters when the indexes have semantics.
Which in everyday computing (as opposed to mathematics) they often do, and those cases are (most?) often, in human terms, much more natural to start from 1: "I have an array of N elements. The first of a bunch of things is thing number one, and the last of N things is thing number N." Hence:
N_things: array[1..N] of Thing;
for i := 1 to N do begin
  Whatever := Whatever + Whateverize(N_things[i]);
end; // for i := 1 to N...
Yeah, that's how old I am: That's Pascal. (With some declarations skipped, and I may have misremembered some syntax.) The canonical example is of course the original Wirth-style max-255-ASCII-characters fixed-length[1] String type: In a string of length N, the nth character is at position n in the string. Character number N is the last one.
> The one nice side-benefit of Julia's mistake in adopting 1-based indexing is that it provided an extra impetus to build machinery to handle arbitrary indexing
1) Arguably, as per the above, not a mistake.
2) Muahaha, "build machinery"? No need to build anything new; that's already existed since the early 70s. (Yeah, that's how old I am: Not adding 19 in front. There was only one "the seventies".) It's not like starting at 1 was mandatory; you could well declare
My_fifty_things: array[19..68] of Thing;
And then "for i := 19 to 68 do ..." whatever with it, if those specific numbers happened to be somehow essential to your code.
(At least in Turbo, but AFAICR also in original Wirth Pascal. Though probably with the max-255-ASCII-elements limitation in Wirth, and possibly also in Turbo up to v. 2 or 3 or so.)
__
[1]: Though from at least Turbo Pascal 3 (probably earlier; also think I saw it on some minicomputer implementation) with the backdoor of changing the length by directly manipulating -- surprise, surprise, it exists! String was a built-in type with its own implementation -- the length byte at index [0]. Better start out with your string declared as length 255, though, so you don't accidentally try to grow it beyond what's allocated.
> often, in human terms, much more natural to start from 1
This meaning of natural is highly culturally dependent. It took the Greeks a startlingly long time to accept that one was a number (because it's a singleton), much less zero. I do not, e.g., want arrays that can't have length one because they have to contain a number of things.
> No need to build anything new;
Well, no, not "new". Arrays with arbitrary bounds is a well-trod path. But they still had to make it work in Julia: CartesianIndices, LinearIndices, and overloading of "begin", and "end" keywords, etc. And the radical dependence on multimethod dispatch meant they couldn't quite just reuse existing work from other languages.
I'm not "explaining your reasoning" because I don't agree with a single word you just said.
> In math when there are semantics, the indices usually include 0
Complete nonsense. For example, in a Laurent expansion you start in the negatives and go up. Now you're gonna say "but that's an exception, I said _usually_". But it's not, this is the general case.
> Array indexing is such a core thing and I don't understand why anything mathematical or scientific would start with 1.
From a data-analytic point of view, indexing should start with 1. When we analyze a data table, we always call the first row the 1st row, or row #1, not row #0. It would be very strange to label rows as 0, 1, 2, 3, .... It may be fine for people with a Computer Science background. But it would create so much confusion for almost everyone else...
It causes problems for people with a CS background too. I once numbered machines in racks with zero-indexing (so that they could match up with zero-indexed ip addresses). Even though literally everyone who touched those machines had CS background: DO NOT DO THIS.
Yes. A German friend of mine moved into her student dormitory in the US, and when she was told that her room was on the first floor, asked whether there was a lift, because she had a heavy suitcase...
Having said that, given that there are basements (in Europe, at least), it makes sense to call the ground floor 0. We are dealing with integers here, not natural numbers.
But floors of multi-storey buildings are a pretty unique exception in the real world in having a characteristic where zero -- the number of stairs you need to climb from the ground floor -- has an actual tangible meaning (on the ground floor).
How many other such examples can you (editorial you; anyone) come up with? Not many, I'd bet.
I thought you'd be wrong, and immediately came up with:
- Hours after midnight. The (non-anglosaxon) watch goes from 0:00 to 24:00 (the latter is useful for deadlines: The proposal must be submitted by Friday, 24:00 (which coincides with Saturday, 0:00)).
Year -- or rather, decade, century, and millennium -- numbering. We're only in the second year of the third decade of the twenty-first century, so the decade will end and the next start at the end, not the beginning, of 2030.
But this is perhaps more of a problem with numbers and zeroes, and ends and beginnings; people don't get that 10 is the last of the 00s, not the first of the 10s. The very first year was numbered 1, not 0, so that's not the problem. Or, it kind of is: People would be right, if the first year had been 0.
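The arithmetic behind this can be sketched as follows, assuming (as stated above) that the first year is numbered 1 rather than 0. The helper `ordinal_position` is made up for illustration.

```python
def ordinal_position(year):
    """Return (century, decade within century, year within decade), all counted from 1.

    Assumes years are numbered from 1, so the first decade is years 1..10.
    """
    zero_based = year - 1                  # shift to an offset from the first year
    century = zero_based // 100 + 1
    decade = (zero_based % 100) // 10 + 1
    year_in_decade = zero_based % 10 + 1
    return century, decade, year_in_decade


print(ordinal_position(2030))   # last year of its decade
print(ordinal_position(2031))   # first year of the next decade
print(ordinal_position(2000))   # still the 20th century under this numbering
```

Every step first subtracts 1 to get an offset and then adds 1 back to get an ordinal. Those corrections would disappear if the first year had been numbered 0.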
> Array indexing is such a core thing and I don't understand why anything mathematical or scientific would start with 1.
Counting things is such a core thing to humans that when we have a bunch of N things we think of them as thing #1 to thing #N. We start counting from 1, not 0.
Indexing from 0 in computing is adapting the human mind to the computer, purely for performance reasons that may have been relevant in the 50s or 60s but were beginning to be obsolete by the 70s. It was done so you could access elements of an array by the simplest possible calculation of your offset into heap memory. When your first element is stored at Starting_address, you need i for that first element to be = 0, just so you don't need to have the compiler add another constant term for each element to "Element is at Starting_address + i * sizeof(element)".
Would have been trivial, even then (as Wirth showed) to add that constant term calculation to compilers, but it was done without in C because that eliminated one whole integer operation from each (set of?) array access(es).
Instead, we got the mental gymnastics of
for (i = 0; i <= N-1; i++) {...}
and its many variations (instead of just for i := 1 to N...), which surely have caused orders of magnitude more headaches in off-by-one bugs over the years than it saved on performance.
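The "constant term" in the address calculation discussed above can be sketched like this. The functions are illustrative only and the numbers are made up; the idea shown is that a compiler can fold the lower bound into a precomputed "virtual origin", so a per-access subtraction is not actually needed.

```python
def address_zero_based(base, i, size):
    """Element i of a 0-based array starting at `base`."""
    return base + i * size


def address_with_lower_bound(base, lower, i, size):
    """Element i of an array whose first index is `lower`."""
    return base + (i - lower) * size


def virtual_origin(base, lower, size):
    """Fold the lower bound into the base once, ahead of time."""
    return base - lower * size


base, size = 1000, 4                      # hypothetical start address and element size
origin = virtual_origin(base, 1, size)    # for a 1-based array

# Per access, the 1-based array now costs the same as the 0-based one.
print(address_with_lower_bound(base, 1, 3, size))
print(origin + 3 * size)
```

With the origin precomputed, the per-element work is `origin + i * size` regardless of whether the array starts at 0, 1 or 19.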