
What I'd like to tell the author.

Don't like null? Don't use it. And if you find null propagation ruining your abstraction, then treat it as an error and fix your program.



It actually turns out that there are better ways of solving problems with null that don't involve just telling programmers to go fix their programs. There are alternatives, like encoding whether a value can be NULL (or some semantic equivalent) into the type system. Rust and Haskell are examples of languages where this is the only way to do things, and C# and TypeScript are examples of languages where you can selectively (and under certain conditions) make distinctions between values which can be null and which cannot be null.
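
As a rough illustration of what "encoding nullability into the type system" can look like, here is a minimal Rust sketch. The `User` struct and `display_name` function are hypothetical names made up for this example, not anything from the article or the thread.

```rust
// Hypothetical types for illustration only.
struct User {
    name: String,             // always present: this type has no "null" state
    nickname: Option<String>, // may be absent, and the type says so
}

// The compiler will not let `nickname` be used as a String
// until the None case has been handled.
fn display_name(user: &User) -> &str {
    match &user.nickname {
        Some(nick) => nick.as_str(),
        None => user.name.as_str(),
    }
}

fn main() {
    let ada = User { name: "Ada".to_string(), nickname: None };
    let bob = User { name: "Robert".to_string(), nickname: Some("Bob".to_string()) };
    println!("{}", display_name(&ada));
    println!("{}", display_name(&bob));
}
```

The point is only that "might be missing" is spelled out in the field's type, so forgetting to handle it is a compile error rather than a runtime surprise.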


> Rust and Haskell are examples of languages where this is the only way to do things,

You know Rust has null too, right? https://doc.rust-lang.org/std/ptr/fn.null.html

Rust has both references (which can't be null†) and pointers (which can be null, but can only be dereferenced within "unsafe" blocks or functions).

† Actually, they sort of can: an Option<&T> takes the same space as a &T, with the None variant of the Option being represented as a null behind the covers.
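
A small sketch of the distinction described here, using only standard library calls (`std::ptr::null`, `is_null`, `std::mem::size_of`). The variable names are arbitrary.

```rust
use std::mem::size_of;
use std::ptr;

fn main() {
    // A raw pointer can be null; creating and inspecting it is safe code.
    let p: *const i32 = ptr::null();
    assert!(p.is_null());

    // Dereferencing a raw pointer requires `unsafe`, so checking it is the caller's job.
    let x = 7;
    let q: *const i32 = &x;
    let value = if q.is_null() { None } else { Some(unsafe { *q }) };
    assert_eq!(value, Some(7));

    // The footnote: Option<&T> takes the same space as &T.
    assert_eq!(size_of::<Option<&i32>>(), size_of::<&i32>());

    // Option<u64> needs extra room, because every u64 bit pattern is already a valid value.
    assert!(size_of::<Option<u64>>() > size_of::<u64>());
}
```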


Read the statement as "the only typical way to do things" or "the only safe way to do things" or "the only way to do things without using a completely different type which supports NULL". The fact that Rust has unsafe non-GC'd pointers which can be NULL or otherwise invalid puts it in the same camp as Haskell, Go, C#, etc. but in Rust you are more likely to use references which can't be NULL.


> Actually, they sort of can: an Option<&T> takes the same space as a &T, with the None variant of the Option being represented as a null behind the covers.

There is no “sort of can” here. It just so happens that None uses the same representation as null, but by that logic you’d say that u64 can be null because 0 has the same representation as null.


Perhaps possible in isolation or on a small team. I don't think this is practically achievable on teams that have grown past a certain size, though.

To paraphrase Carmack, "Any syntactically valid code, that the compiler will accept, will eventually make it into your code base." [1]

[1] https://www.youtube.com/watch?v=Uooh0Y9fC_M


Not if you review your code reasonably well and use static analysis plus some testing, making it a process to not get bad code into your codebase.


At some point one will include external libraries, written with different styles and standards. And then will modify the libraries, making them effectively part of the core code base.


OP here - as a matter of fact I try to do just this, starting with the database. I'm mostly a data person so I try to think through, as deeply as I can, what I should expect in every table - there has to be a sensible default and if I can't find one then I rethink my design. You'll probably disagree with me and grunt out another single sentence missive, which is fine, but I think it's worth taking some extra time and using Null as a bit of a warning. It's a crutch! A way to stop thinking and say "whatever I don't know what this value is supposed to be so... it's null. Let's go shopping!"


Frankly, the idea that there must be a sensible default worries me.

Take a database of people. There is literally no sensible default for name, age, gender, height, weight, social security...

If you’re Amazon, your products have no sensible default for manufacturer, shipping weight/size, delivery address...

In fact, for just about any real-world data, there simply is no sensible default for anything at all. Most “sensible” defaults will eventually bite you in the arse. The only sane way to keep nulls from your DB is to refuse inserting incomplete data in the first place, and propagate the error to the user. Heavens save your team if you’re dealing with batch data and insist on not allowing nulls in the DB, though.

You can sweep this mess under a rug and pretend you have no nulls by turning things into relations that are allowed to be empty — “there are no delivery_address rows for this user” — but that’s a null in sheep’s clothing. Either your application knows how to deal with the query coming up empty, or it doesn’t.


What do you use in a database when you have a field where you literally do not know what the value should be?


If you have a PEOPLE table and some birthdates are unknown, then remove the "birthdate" column and make another table called PEOPLE_BIRTHDATES with a "birthdate" column and a foreign key pointing to PEOPLE. Now your queries can have lots of left joins. The results will still have nulls, however.
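
To make that concrete without a real database, here is a sketch that emulates the left join in memory. The `Person` struct, the `left_join` function, and the sample rows are all hypothetical; the SQL in the comment is only the query being imitated.

```rust
use std::collections::HashMap;

// Hypothetical in-memory stand-in for a row of the PEOPLE table.
struct Person {
    id: u32,
    name: &'static str,
}

// Emulates:
//   SELECT p.name, b.birthdate
//   FROM people p LEFT JOIN people_birthdates b ON b.person_id = p.id
// The missing side of the join has to show up as *something*; here it is None.
fn left_join(
    people: &[Person],
    birthdates: &HashMap<u32, &'static str>,
) -> Vec<(&'static str, Option<&'static str>)> {
    people
        .iter()
        .map(|p| (p.name, birthdates.get(&p.id).copied()))
        .collect()
}

fn main() {
    let people = [Person { id: 1, name: "Ann" }, Person { id: 2, name: "Bo" }];
    // Only Ann has a row in the hypothetical PEOPLE_BIRTHDATES table.
    let birthdates = HashMap::from([(1, "1990-04-01")]);
    for (name, birthdate) in left_join(&people, &birthdates) {
        println!("{}: {:?}", name, birthdate);
    }
}
```

The schema no longer has a nullable column, but the joined result still carries an "absent" marker for Bo.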


Which is the reason why you shouldn't write outer joins.


So if we don't know the customer's birthdate we can't serve her? I can imagine a problem with that...


Sigh.

Where have I said any such thing ?


If there's no row for the customer in the joined table, the customer won't show up in an inner join.
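
A minimal in-memory sketch of that behaviour, assuming hypothetical `customers` and birthdate data (the function name `inner_join` is made up for the example):

```rust
use std::collections::HashMap;

// Emulates an inner join between a hypothetical customers table and a
// customer birthdates table: only customers with a matching row survive.
fn inner_join(
    customers: &[(u32, &'static str)],
    birthdates: &HashMap<u32, &'static str>,
) -> Vec<(&'static str, &'static str)> {
    customers
        .iter()
        .filter_map(|(id, name)| birthdates.get(id).map(|b| (*name, *b)))
        .collect()
}

fn main() {
    let customers = [(1, "Ann"), (2, "Bo")];
    let birthdates = HashMap::from([(1, "1990-04-01")]);
    // Bo has no birthdate row, so Bo is silently absent from the result.
    println!("{:?}", inner_join(&customers, &birthdates));
}
```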


Great. Now if you can explain to me where you got the idea that a join (inner or otherwise) is the only possible way to query two tables then we might get somewhere. Because you can also just do two queries. And no, that does not necessarily mean "two roundtrips to the DBMS" (which I know perfectly well is undesirable). There are techniques for avoiding that. Perhaps not in SQL, but that's a reason you should be pressing the vendors to improve SQL. Not for you to agree to the status quo of sticking with the vendors' old bypasses-and-hacks cheating bag.


Haha OK then use a UNION... oh wait we're gonna have NULLs with that too. One suspects you'll also have some vague objection to this point, but if the only way to address that is to wait on somebody to invent an "improved" SQL, one won't worry about it too much.


That "improved SQL" was already defined in the previous century, and has been implemented as well. Your ignorance drips off of every word you write.


E.F. Codd literally designed null into the relational model. I don't think calling someone ignorant is very helpful.


But the demeaning ridicule that gets thrown at me is ?

(BTW I doubt very much that "Codd designed null into the RM". Even his 12 rules mention only "a systemic way to deal with missing information", not "null".)


> What do you use in a database when you have a field where you literally do not know what the value should be?

You don't.

If a value may not be present for an entity, it's not an attribute of the entity in question, it's an attribute of another entity that has a (0..1):1 relationship to the entity in question.

Normalization eliminates NULL.


That's great. Now I do a query. Maybe I use a join. If a row has the "0" case of that (0..1):1 relationship, what do I get?

Or maybe I don't do a join. Maybe I do a separate query. If the query comes back with zero rows, then I... what?


What do I get ? You get what you ask for.

Then I ... what ? Then you do what needs to be done as specified by the business in the case the queried piece of information is unknown.


> What do I get ? You get what you ask for.

In the join case, don't I get a NULL in the row that comes back if there isn't an entry in the other table? Or do I just not get a row?

> Then you do what needs to be done as specified by the business in the case the queried piece of information is unknown.

Sure, but how do I represent that condition in my software? With a different class/structure? With a flag that indicates that the other field isn't valid? Or with a null?

From where I sit, normalization doesn't make the problem go away at all.
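
One possible answer to "how do I represent that condition in my software", sketched in Rust under the assumption that a separate query returns zero or one rows. The function names and the shape of the row data are hypothetical.

```rust
// Hypothetical result of a second query such as
//   SELECT birthdate FROM people_birthdates WHERE person_id = ?
// which yields zero rows or one row under a (0..1):1 relationship.
fn birthdate_from_rows(rows: Vec<String>) -> Option<String> {
    rows.into_iter().next()
}

// The "do what the business specifies" step: both cases must be handled.
fn describe(name: &str, birthdate: Option<&str>) -> String {
    match birthdate {
        Some(date) => format!("{}, born {}", name, date),
        None => format!("{}, birthdate unknown", name),
    }
}

fn main() {
    let found = birthdate_from_rows(vec!["1990-04-01".to_string()]);
    let missing = birthdate_from_rows(Vec::new());
    println!("{}", describe("Ann", found.as_deref()));
    println!("{}", describe("Bo", missing.as_deref()));
}
```

Whether the empty case is spelled as zero rows, a flag, or `None`, the application still has to carry it; the type only makes that obligation visible.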


So, don't use any OS, library or code base built on top of C, C++, Java, .NET, JavaScript, SQL...?



