When TDD Doesn't Work

Nursie · on April 30, 2014

IMHO TDD, like a lot of the agile stuff, is a good idea with solid foundations that people get wrong all the time and end up making things worse with.

Agile was supposed to ease up on process and make teams adapt to changing requirements. It wasn't supposed to use up >30% of your working time just to service the methodology, but that's what it ends up doing when you get in the Agile Evangelists.

TDD was supposed to ensure more correct software at the cost of some overhead (perhaps 30%?) by making sure every unit had its tests written ahead of the code. In practice I've seen it kill productivity entirely as people write test harnesses, dummy systems and frameworks galore, and never produce anything.

A combination of these two approaches recently cost an entire team (30+) people their jobs as they produced almost nothing for almost a year, despite being busy and ostensibly working hard all year. We kept one guy to deal with some of the stuff they left behind and do some new development. When asked for an estimate to do a trivial change he gave a massive timescale and then explained that 'in the gateway team we like to write extensive tests before the code'.

The only response we had for him was 'and do you see the rest of the gateway team here now?'

forgottenpass · on April 30, 2014

IMHO TDD, like a lot of the agile stuff, is a good idea with solid foundations that people get wrong all the time and end up making things worse with.

I think the reason it can go sideways is because the advocates and casual adherents don't accept that for some teams a methodology really may not provide the claimed benefits, or it costs too much elsewhere. It's easier to say "It's not [methodology] that isn't working, you're doing it wrong."

That's a really attractive answer, made all the more tempting because it's sometimes true. But not every software team is the same, operating in the same constraints. People tend to generalize based on their own experiences. Lessons learned about what works for X doesn't necessarily apply to Y. The thing that has been lacking in the TDD/Agile/Insert Fad is a higher level "this is why the ideas worked for us, here are the component pieces and their purpose, this is how to determine the pieces to adopt and how to tailor them to your organization."

You could say that someone out there is making that case, but their voice is drowning under the snakeoil salesmen. I don't hear it up front, the rare times I do hear it is deep into the "you're doing it wrong" conversation when any sense of perspective in the discussion has already been beaten to death.

spion · on April 30, 2014

I completely agree. I think these issues stem from our industry's tendencies to abstract the problem away from the solution before teaching the solution. This is the wrong way to teach.

If you show other developers how you solved your particular problem (how your solution came to be) and explain why you used that approach, many will quickly be able to discern how and when to apply your solution to their particular problem area. If instead you just give them a solution, they're missing a piece of the puzzle and may arrive at the wrong conclusion.

Other developers are also pretty intelligent folk - don't do your deduction for them. Let them do it themselves.

I think our industry suffers from the lack of studying and describing the history of concrete software systems. It could be done so easily now - we have the entire source code history available, we have the history of issues. Books could be written about the evolution of popular software systems - especially open-source classics (e.g. emacs or the linux kernel or Firefox). Those books would describe the problems encountered during the development and their solutions, the design / architectural decisions, as well as the evolution of the development process.

Studying software history should also help us not reinvent the wheel badly. If we studied history perhaps we would've known more about NLS or Smalltalk or why other ideas were invented. How many of us know why the concept of objects was invented in the first place? How many know why classes were invented - what particular problem prompted someone to invent them, and what the circumstances were at that point in time?

The reason why this history is so important is that when inventions go through the reuse process, they're never quite that perfect at solving those newer problems compared to the original problem. Therefore the further away we are from the original problem, the less likely we will be to understand the general aspects of the solution or to re-use the original thinking process to adapt the solution to our needs.

Finally it will also allow us to gain at least some of the knowledge that can presently be gained only from experience and mentoring - which is always a good thing.

parasubvert · on April 30, 2014

The emphasis I take away from your story is that something may be a good idea, even foundational, but there's no guarantee your team will interpret or execute it correctly.

If you do the math on how much time should be taken up by Agile meetings, it's about 10% of the team's time. Add another 10% for vacation, illness, town halls, dentist appointments, etc. that leaves 80% to understanding requirements, writing code and testing code. Yet the market for "agile consultants" is one of lemons - you don't know if this guy/gal is a huckster or truly working for your success.

Similarly for TDD, I've never heard of TDD being about a team writing tests em masse before code - that wasn't something Kent or Bob Martin ever recommended anyway.

Ultimately this is why I believe the most important roles on a software team are the "management roles" - Product Owner first and foremost. any solid Product Owner I've worked with would have mandated a demo after every iteration and ejected the software team management very quickly if there were no results. Better to punt the problem child early instead of taking the whole team down later!

normalhuman · on April 30, 2014

> IMHO TDD, like a lot of the agile stuff, is a good idea with solid foundations that people get wrong all the time and end up making things worse with.

So one could reasonably suspect that people who "get agile" are just talented and would be good developers anyway. Occam's razor invites us to assume that agile has no effect. Are there any scientific studies on the effectiveness of agile (or TDD), or is this just a homeopathy situation?

jdlshore · on April 30, 2014

If you're going to ask that, you should tell us which studies you used to select the other techniques you use.

(It turns out that there are, tons actually, but software engineering a difficult topic to study and I wouldn't draw any conclusions from them. The vast majority of controlled software engineering studies I've seen are conducted on students using problems of trivial scope. The uncontrolled ones aren't much better. Basic questions like "what do you mean by productivity" have yet to be answered well.)

tbrownaw · on April 30, 2014

Are there any scientific studies on the effectiveness of agile (or TDD), or is this just a homeopathy situation?

Yes (and yes) on TDD. None of the ones discussed in Making Software: what really works and why we believe it were done particularly well, and the less bad ones tended to show no effect or contradictory mixed results (ie, disagreed on what went better vs what went worse).

"Agile" is far too ill-defined to actually study usefully. Specific variants or individual practices could be studied.

hox · on April 30, 2014

nonsense. I'm not an advocate of large process by any means, but a number of principles of the agile movement can benefit developers of any skill level or experience. and more importantly it can help a team collectively more than it can help the individual. the trick is imposing these principles carefully, which almost every agile leader I've met fails to do.

I've never seen anything wrong with more communication between stakeholders and adapting a solution to meet their ever-changing needs.

normalhuman · on April 30, 2014

> I've never seen anything wrong with more communication between stakeholders and adapting a solution to meet their ever-changing needs.

Surely there's a point where more communication starts being detrimental. Reducing it to the absurd: if you spend 100% of the time communicating then you have no time left to actually build the thing. So there's a trade-off, as usual.

It could also be argued that ease of communication to deal with "ever-changing needs" encourages more superficial requirements and less deep thinking about the actual problem, leading to wasted effort and lower quality results.

Maybe you are right, maybe the above paragraph is right. I don't know and neither do you. Replying "nonsense" is not really an argument.

tbrownaw · on April 30, 2014

I've never seen anything wrong with more communication between stakeholders and adapting a solution to meet their ever-changing needs.

There is an effect where people who are invited to second-guess themselves become less happy with their initial decisions. There is another effect where cost or price or effort or social-standing is associated to quality.

If you give the impression of falling over yourself to serve the whims of your stakeholders, you're doomed. If you set reasonable limits... well that requires experience to do properly, and that same experience could be used to twist a more traditional methodology into something useful.

Perhaps agile has a stronger sink-or-swim learning curve? You either learn quickly and succeed very well, or you never figure out what went wrong.

DavidWoof · on April 30, 2014

"'in the gateway team we like to write extensive tests before the code"

Which, ironically, is the opposite of TDD. One advantage of classic TDD unit testing is that your tests grow with the code. One of the dangers of not doing TDD is the situation above, where your integration tests require massive scaffolding and custom frameworks upfront, essentially turning the process into waterfall.

reedlaw · on April 30, 2014

I've never seen a project fail because of too much testing. Maybe because of an over-emphasis on process, but not because of writing too many useful tests. On the contrary, projects I've worked on fail or approach failure because of lack of clear requirements, whether in unit test form, BDD, or well-written user stories. If it's not clear what a product owner wants then it's impossible to test and impossible to implement to match the owner's expectations. TDD is useless if you don't know what you're trying to build.

Nursie · on April 30, 2014

Too much testing, no. Too much time spent building test frameworks, simulators and 101 other things before a line of code is written? Well I just witnessed it last year.

Not that it was the only factor. Heavy 'agile' process was certainly part of it. They also threw everything away and restarted again at some point, likely due to changing requirements. But it was part of the picture that added up to nothing getting done.

arjie · on April 30, 2014

Jesus, what a nightmare! You look back on your work and you've only produced tooling instead of solving the problem you set out to solve. An easy trap to fall into, but doing it for a year is something else.

Nursie · on April 30, 2014

It wasn't quite as bleak a picture as maybe I've painted.... but not far off either. Must have been pretty depressing for the team as well as for the folks running the show.

On the positive, I think they all got good roles with our competitors!

JabavuAdams · on April 30, 2014

The problem is that it's entirely reasonable and common for a product owner to not know what they want. It's not some inconvenience that can just be ignored.

It's a fairly uninteresting problem if you can fully specify it before development starts.

Specification should be more like a conversation. That's the whole point of a lot of incremental and customer-focused development.

The problem is when you're under time constraints or technological constraints that don't allow for rapid iteration.

reedlaw · on April 30, 2014

The discipline of writing tests tends to tease out design requirements fairly quickly. Because as soon as someone takes the time to think carefully about the implications of what the product owner wants, it usually results in a feedback loop eliciting further details. Like a conversation, yes, but one which results in documented, tested implementations.

collyw · on April 30, 2014

>Because as soon as someone takes the time to think carefully about the implications of what the product owner wants, it usually results in a feedback loop eliciting further details.

I am not a TDD person, and that sounds pretty much what I do. Why do I need a load of tests to find out more details of what I am trying to implement?

dragonwriter · on April 30, 2014

The benefits of testability as a focus of requirements gathering and automated tests in the source tree as a concrete artifact of that requirements gathering is:

(1) If something is specified well enough to be automatically tested, there is clear agreement (and not superficial agreement hiding different interpretations).

(2) If automated tests are created and in the source tree, the unambiguous knowledge in #1 is preserved which makes resolving future questions of expectations and intent of the existing code base easily resolvable.

lmm · on April 30, 2014

For most people the easiest way to see what using an API looks like is to use it. And the emergent design that results can often be better than what you would have designed if you tried to design by thinking about it without writing code. http://c2.com/cgi/wiki?WhatIsAnAdvancer

collyw · on May 5, 2014

That's not really answering the question I asked.

And writing an API takes up a small proportion of my time. Usually its generating database queries, and spitting them out as web pages, and presenting the data in a way that non technical users can understand.

lmm · on May 6, 2014

If you're spending a substantial proportion of your time generating database queries or rendering results to web pages you should look at using better libraries for doing so, and/or improving your own library-type code.

If you're talking about figuring out how best to present the data, that's not really "development" per se. But it is another kind of API, and again something that you can design more effectively using a test-oriented approach (e.g. start by figuring out the use cases for what a user wants to find out from the information).

collyw · on May 6, 2014

How do you know I am not using good libraries libraries for writing my code? (I usually use Django which seems to be well respected). Again, that has nothing to do with test driven development.

I already start by figuring out what the user wants. Again, nothing to do with writing a load of tests first.

All your suggestions for how TDD will improve my development seem to point to exactly what I do already without writing tests first.

lmm · on May 7, 2014

> How do you know I am not using good libraries libraries for writing my code?

You said "writing an API takes up a small proportion of my time. Usually its generating database queries, and spitting them out as web pages...". Which suggests to me that you're not using libraries effectively, because those things are a tiny proportion of my time. The business logic - i.e. the part that's actually specific to your problem - should be where you spend most of your time, and that's the part where TDD is effective.

> I already start by figuring out what the user wants. Again, nothing to do with writing a load of tests first.

Writing a test for a use case ensures you actually understand it. It helps you find more possible problems or misunderstandings in the same way that a blueprint is an advantage over a sketch. And it's probably the most effective way to communicate these use cases to developers you're collaborating with.

collyw · on May 8, 2014

Wow, so without knowing very much about my work, you are able to tell me where I should be spending the majority of my time.

And how does writing a test ensure that I understand a problem any more than not understanding it? If I don't understand it properly I will likely write the wrong test.

jdlshore · on April 30, 2014

> TDD was supposed to ensure more correct software at the cost of some overhead (perhaps 30%?) by making sure every unit had its tests written ahead of the code.

Not exactly. Remember that TDD came from Extreme Programming, and the radical idea of Extreme Programming was "embrace change:" the idea that you could accept—no, desire—requirements changes after you started programming.

At the time, all software design was supposed to be done in advance; to do it any other way would lead to madness. The (fictional, it turns out [1]) "cost of change curve" said that a change in requirements would cost 20-150x as much if made after coding began, and thus all requirements had to be nailed down in advance.

XP said, "what if we could flatten the cost of change curve, so that the cost of a change is just the cost of implementation, regardless of when the change is suggested?" That's the whole raison d'être of XP.

The cost of change curve was flattened by using evolutionary design. The way you got evolutionary design was with four practices: pair programming (to improve quality), simple design (to avoid painting yourself into a corner), refactoring (so you could change the design), and... TDD. So you could refactor safely.

TDD is about enabling change. The quality benefits are also valuable, but not the main point. That's why TDD'ists care so much about fast tests—you need quick feedback when you're doing design refactorings.

[1] Laurent Bossavit investigated the literature for the source of the cost of change curve claim and determined that it was based on people graphing their opinions, not empirical data. Over time, those opinion graphs were assumed to be based on real data, but they weren't. https://leanpub.com/leprechauns

AnimalMuppet · on April 30, 2014

> Agile was supposed to ease up on process and make teams adapt to changing requirements. It wasn't supposed to use up >30% of your working time just to service the methodology, but that's what it ends up doing when you get in the Agile Evangelists.

Well, I've worked in a (non-agile) environment where the methodology ate up way more than 30% of our working time. If an Agile Evangelist could have gotten us to 30%, most of us would have been ecstatic. (It wouldn't happen, though - we were FDA regulated as a medical device manufacturer, which imposed huge overhead requirements.)

brudgers · on April 30, 2014

30% overhead is often a function of team size, not methodology. A team of 30 is not likely to be agile in a meaningful sense, there are too many coordination vectors and communication channels.

And if those in a position to ask about the rest of the team aren't sold on agile to begin with, the odds of it working are inversely proportional to the odds of people just going through the motions while fearing for their jobs and polishing their resume for a year.

Nursie · on April 30, 2014

There were various sub-teams that had their own working areas and their own sprints, it wasn't one huge 'agile' team of 30.

>> And if those in a position to ask about the rest of the team aren't sold on agile to begin with, the odds of it working are inversely proportional to the odds of ...

So we're agreed, it's not a silver bullet. It might work, it might not, and people pretty much have to be believers to get any benefit out of it.

Is this starting to sound like a religion yet?

jjfine · on April 30, 2014

You pretty much have to be a believer in anything to get a benefit out of it...especially something that's challenging and requires discipline.

_d8fd · on April 30, 2014

I once asked an experienced developer what hr thought about Agile and TDD. He responded by saying that they are useful tools, when used by people who know what they're doing.

There's no replacement for working with quality people, and no tool prevents you from being a moron.

feketegy · on April 30, 2014

TDD and agile in general requires discipline.

You can't expect somebody to drive a car if they don't even know what a gas pedal is.

Nursie · on April 30, 2014

Software development requires discipline :)

I've seen it said here before - you can't expect a mediocre team to become world class by forcing them into a set of methodologies, you'll just get people who are mediocre at doing it that way too.

rondon2 · on April 30, 2014

What is required is shades of grey. Here is a simple example.

Lets say our team's methodology includes documenting code changes.

Some commits should require API Documentation updates, e-mails to the entire team, an entire separate document about the changes.

Other commits may only need a short comment like "removed extra whitespace"

lmm · on April 30, 2014

Fallacy of the grey. Yes, a good team will be better than a bad team for any reasonable methodology. That doesn't mean there aren't methodologies that are better than others.

Nursie · on April 30, 2014

I don't disagree, but I don't think that becoming more Agile* is necessarily always a very good methodology.

*(capital A, for what it has become at the hands of process consultants rather than what was intended originally)

FourthProtocol · on April 30, 2014

Not doing TDD and waterfall in general requires discipline.

GFK_of_xmaspast · on April 30, 2014

That sounds a lot like a failure on the part of management.

yanowitz · on April 30, 2014

Except for this statement, beware of absolutism in statements of How To Do Software Development.

The specifics of this debate are kind of uninteresting because of the (general) lack of nuance from various sides, albeit all informed by their own lived experience.

OTOH, the recurrent reality of <insert topic> debate in our industry is very interesting.

I think it's some combination of:

* a bunch of problems are still unsolved

* software is so powerful that sub-optimal solutions are usuallly Good Enough

* industry amnesia, driven by developer/engineer turnover

* the relative infancy of the industry, especially as a function of the rate of change (I'm not sure how you would normalize for rate-of-change, social structure and communication speed, but it would be interesting to compare these debates to medieval guilds in Europe).

* ???

To take up the first two above:

Things are better than they used to be -- as late as the 90s, code reuse was still an unsolved problem. Of course, code quality is still hard--we are reusing broken code, but at least we "only" have to fix it once.

I think it's hard to overestimate the importance of Good Enough as a factor in these recurring debates. Everyone can be right from the business's point of view--tons of money is still being saved. Once you get past the initial ramp of a company, how to structure for continuing velocity of a team and make headway in your chosen market(s) seems like a different optimization problem than what got you there (again, not a new topic!)

Just some partially formed thoughts...

protonfish · on April 30, 2014

Was that an implication that code reuse is solved? I'd say there are more options but their value is still a matter of debate. I have a hard time imagining that code written today is better than 30 years ago. (We do have much better source control tools so at least that's something.)

DougWebb · on April 30, 2014

I doubt that code written today is better than 30 years ago. We have much better tools, including source control, and much better hardware, but that's enabled us to write vastly more code than was feasible 30 years ago. I think there's been a choice about whether to use that extra power to write either better code or more code, and we've veered sharply towards writing more code. Some if it is better, but most of it is probably worse.

30 years ago was right at the start of my 'career', learning to program my ZX-81 as a pre-teen. I learned to program it in Z80 machine code, because it only had 1K of RAM and you couldn't fit much BASIC code into that space. (Also, there was no assembler. You had to write the assembly code on paper, manually convert it to machine code, and type in the bytes into a comment line in a BASIC program.) My experience with the ZX-81 isn't that much different from what the grownups had gone through over the previous 10-20 years with teletypes, mainframes, and the other early PCs.

The degree of low-level fiddling and knowledge needed to program computers back then led to much more careful analysis and understanding, I think. No one had to tell us to design first before we sat down and started coding, because there was no other option. And not just design; we had to run the code in our head and debug it before it was ever put into the computer. That's a skill that I've pretty much retained today, 30 years later, and occasionally I still use it. (eg: During code reviews, or thinking about a problem I'm working on while in the shower.)

Younger developers, mostly, haven't gone through that kind of experience. I think it makes a difference. I'm really happy about the existence and popularity of the Arduino and similar open hardware platforms because they bring back that bare-metal level of software development. I'll bet that today and in the future, the best programmers are going to have a shared experience of working on devices like that when they were young.

tluyben2 · on April 30, 2014

I am around your age and started then as well, also Z80. In my area it was rather normal to just write and run hex in your head. I still know the whole instructionset in hex as I wrote so much code with it. I totally fail to see the reason why anyone did use an assembler for that; I can just read and write the hex codes and did. Besides the time it took to get back after a crash it did not take much more time to 'get shit done' then as it does now. There were just vastly less programmers and thus less code was written. Unfortunately I don't have my 80s code anymore but my output has not changed much; better code and the code does more because of the libraries we have now, but it's around the same amounts. Running code in my head is great and was easier with something limited as an MSX (or zx for that matter); but it gives me a lot of advantage I believe.

praptak · on April 30, 2014

About code reuse - here is some disagreement about whether it is a good thing in the first place: 'I also must confess to a strong bias against the fashion for reusable code. To me, "re-editable code" is much, much better than an untouchable black box or toolkit. I could go on and on about this. If you’re totally convinced that reusable code is wonderful, I probably won’t be able to sway you anyway, but you’ll never convince me that reusable code isn’t mostly a menace.' This is from Donald Knuth: http://www.informit.com/articles/article.aspx?p=1193856

jarrett · on April 30, 2014

I have to wonder if he was thinking about, for example, a reusable implementation of a hash table. And if we was, why in the world wouldn't he want that. Running with the hash table example: I use them many, many times a day, and if I had to reimplement them every time, I'd be sunk. Just a few moments ago, I wrote a script to check for duplicate entries that looked something like this:

  seen = {}
  elements.each do |element|
    if seen[element]
      raise "Duplicate element: #{element.inspect}"
    end
    seen[element] = true
  end

It's quick and dirty and isn't a shining example of architecture. But it found my duplicates and let me move on with my day. But what if I didn't have a reusable hash implementation to lean on? Would I even have attempted to write that script? Or would I have done my duplicate checking manually, wasting about an hour?

sheriff · on April 30, 2014

Looks like Ruby. Why use a Hash instead of the built-in Set class?

Jtsummers · on April 30, 2014

From your linked article:

  Andrew: A story states that you once entered a programming
  contest at Stanford (I believe) and you submitted the winning
  entry, which worked correctly after a single compilation. Is
  this story true? In that vein, today’s developers frequently
  build programs writing small code increments followed by
  immediate compilation and the creation and running of unit
  tests. What are your thoughts on this approach to software
  development?

  Knuth: As to your real question, the idea of immediate
  compilation and "unit tests" appeals to me only rarely, when
  I’m feeling my way in a totally unknown environment and need
  feedback about what works and what doesn’t. Otherwise, lots of
  time is wasted on activities that I simply never need to
  perform or even think about. Nothing needs to be "mocked up."

Amusing given the link prompting this whole thread.

dllthomas · on April 30, 2014

In the context where Knuth was writing, he was substantially correct. In languages where you can sufficiently encapsulate sufficiently abstract patterns, it ceases to be. When your abstractions don't leak, you don't need to touch the innards. As Rusky says, size also plays a role.

tel · on April 30, 2014

The very notion of a "non-leaky abstraction" is a tool that is only just beginning to be employed by programmers. At its most advanced level you're talking about bisimulation proofs over abstract data types.

We may get there one day, but for right now just spec'ing interfaces as combinations of consistent laws is a stretch.

dllthomas · on April 30, 2014

I'm not sure that's true. It would certainly be an interesting anthropological undertaking to pin down the details, but (at substantial risk of being wrong) I feel like this was an assumption in early attempts at abstraction and we only feel the need to specify "non-leaky" because we have discovered important things that previous attempts have - in practice - tended to leak.

tel · on May 2, 2014

I think I'm in agreement with you here—we're slowly, as a community, discovering how to make non-leaky abstractions. They've been around for a while in places where "in practice" had a lot of legroom (pure mathematics). CS has a lot of legroom too, but it's taken a long time for us to think about it in such a way to know where we can place our weight.

Rusky · on April 30, 2014

I think he's probably right about bigger, higher-level, or more domain-specific things. But there is a boundary- tools, languages, simple libraries, etc. can be, should be, and are reused to great effect.

narag · on April 30, 2014

I believe recurrent debates are people problems more than technical problems. Although news of DHH's denouncing TDD have been very popular, the technical debate over the details is the same boring story as ever. There are people that can't keep themselves clean from this kind of causes.

enginerd · on April 30, 2014

I agree with your (likely more than) partially formed thoughts; there is still a lot we don't know. In my case, I often work with CSS and TDD would not just add complexity, it conceptually does not make sense with my workflow. It is human nature to generalize solutions. But in our industry, we sometimes find ourselves generalizing 'development' when that is an enormity in and of itself.

From the article: "Besides, how do you know if the CSS is correct? Remember we are doing TDD. We are writing our tests first. How do you know, in advance, what the CSS should be?"

jayvanguard · on April 30, 2014

> as late as the 90s, code reuse was still an unsolved problem.

But that is about the social and business structures and economics around how computing matured. Zero to do with technology. The technology aspects of code re-use were worked out long ago.

grandalf · on April 30, 2014

While DHH's rant has spawned an interesting discussion, it feels to me like he's arguing in reverse in defense of his framework.

Many Rails apps are tightly coupled, and many unit tests written by developers using rails test 10% program logic and 90% framework features.

Of course this is going to be slow. We can argue about hacks to make it faster but at a certain point it's a problem whose solutions start to distract us from solving the important problems.

If you have a web app and get to write a single test to determine whether it's safe to deploy, that test would be an integration test.

The decision to write tests more granular than integration tests is a decision to be made based on assumptions about the rate of change of components of the system.

TDD is tangential to the above observation.

There are many cases where an implementation is easy to figure out (though possibly time consuming) while the optimal interface design is less obvious. TDD can be really useful to quickly iterate interfaces and verify that all the moving parts work as expected together, before worrying about the implementation details... This makes it possible to work on a larger system with more focus on problem solving, fewer mistakes, and less overall cognitive load.

jshen · on April 30, 2014

It's not slow to test a rails app. See DHH's latest post.

insensible · on April 30, 2014

In which he enumerates measurements of what are quite slow test runs by TDD standards.

jshen · on April 30, 2014

And TDD is of no use to me if it expects me to run my full test suite whenever I change a line of code. If my code isn't decoupled enough to run a subset of tests for a small change then my code is bad so there is no good reason for TDD to demand I run the full suite every 5 seconds.

grandalf · on April 30, 2014

It depends on how coupled your code is. Pure ruby application logic is really fast to test, so there is no reason not to test all of it.

But due to the high degree of coupling of logic to AR model code, callbacks, etc., you end up testing much of Rails along with your own logic, which makes the tests slow.

If your "business logic" is just Rails associations and one or two callbacks, then arguably (and this is what DHH is arguing, I think) it probably doesn't need to be unit tested. Other parts of your app are far more likely to break or behave unexpectedly.

However if you are doing any kind of nontrivial software design, unit tests + TDD can be extremely helpful.

jshen · on April 30, 2014

I don't think a full test suite that takes 4 minutes is slow, and I don't see much if any value in adding a bunch of abstractions and indirection for the purpose of running the full suite every 5 seconds. I get 99% of the value by limiting my test execution to the test file for the chunk of code I'm editing and occasionally running the full suite. Running the full suite after each edit is of very little value.

programminggeek · on April 30, 2014

I have a really crazy idea on how you could test the GUI in a way that would both save time and provide something more valuable than just testing true == true.

My idea is to create tests that take screenshots and do a diff of them over time. You then could set a change % threshold that would signal a test "failure" signal. Your QA team could then run through that process and see that something significant changed. Maybe that is fine, but maybe that is hugely unintended.

Having a Time Machine of screenshots of different processes, you could compare changes easily and see if they are worth further investigation. For example, this would be useful if you change some CSS or JS for just one page, and it ends up breaking another page.

The key point of this system is not that it would tell when your system is broken, but rather that there was a significant change that occurred that might have broken something. It's not a substitute for human analysis or thought.

Is anyone doing something like this and would it be useful to anyone else?

DougWebb · on April 30, 2014

My company's QA team set up a system like that around 1999. There were far too many false-positives, because the GUI intentionally changes all of the time during development. So instead of testing functionality, the team spent all of their time updating screenshots. The worst was when we made a very simple style change that affected every page in the application; they'd have to redo every single screenshot instead of just doing a 5-second test that results in "Yeah, the banner is the right shade of blue now, and I know it's used on every page."

forgottenpass · on April 30, 2014

Dunno why your comment is downvoted, this is a valuable testing strategy. Taking screenshots as an automated tool walks through your GUI is a great way to find regressions without writing tests for every last pixel. And you can run something like this on it's own or bolted onto existing test cases.

morganherlocker · on April 30, 2014

Facebook does this with a project called Huxley[1]. It seems cool, but these sorts of tools have always suffered from the problem of brittle tests, so it is not a silver bullet. It does seem that it would work well, however, in systems that require stringent oversight around UI changes (like facebook). Places where "if a piece of UI changes by a pixel, we want to know about it and OK the change" is the standard (most apps do not fall into this category though).

[1] https://github.com/facebook/huxley

lvturner · on April 30, 2014

I actually did pretty much exactly this for a previous job for testing a rendering engine - we had a 'golden master' set of screenshots, a bunch of code to render those original golden masters, then used perceptual diff (http://pdiff.sourceforge.net) to check against the golden masters.

It was a bit of an experiment and didn't get used that much - though it did come in handy when trying to write UI rendering that worked with XAML

projectileboy · on April 30, 2014

There are tools that do something similar. At Siemens we used a tool called T-Plan (t-plan.com) to test behavior of a railroad control system. It wasn't perfect, but it worked surprisingly well. eggplant (http://www.testplant.com/eggplant/testing-tools/) also looked good, but didn't happen to work for our system at the time (not really a comment on eggPlant).

bcgraham · on April 30, 2014

Houdini [1] aimed to do this, but it's been a while since I checked it out -- not sure if it launched or is still under development.

[1] http://www.tryhoudini.com

joshuacc · on April 30, 2014

You mean something like this? https://dpxdt-test.appspot.com/

lnanek2 · on April 30, 2014

This isn't really true. At smartphone OEMs we certainly do have boxes to put the devices in that perform physical tests like on the touch screen, microphones, speakers, antennas, etc.. And in mobile development we have UI automation tests that confirm buttons are certain color, have certain text or state, the right screens popup when pressed, etc. - heck we have a program called the monkey that presses everything that can be in addition to the UI automation scripts. I know the web side of things has Selenium and similar robots. I think he just hasn't ever worked somewhere where everything is tested which is reasonable. In many cases you have nothing to do with the OS your software is running on, for example, so there isn't as much point in testing beyond what your app outputs to it.

reedlaw · on April 30, 2014

I think what Uncle Bob is referring to are things like layout, color, and so on. Of course you can automate browser interactions with Selenium, but you can't easily catch layout changes, broken UI elements, or regressions. The only method I know of that can come close is automated screen capture comparison. But that wouldn't work perfectly and still requires human intervention to check out false positives.

dllthomas · on April 30, 2014

"The only method I know of that can come close is automated screen capture comparison. But that wouldn't work perfectly and still requires human intervention to check out false positives."

It would require human intervention in the case of a failure, to be sure, but those cases where it can guarantee I don't need to bother looking because my change didn't change anything visible in those screenshots is potentially a significant boon.

jimejim · on April 30, 2014

I'd argue that the test has value if it's central to what you're doing and will save you time in the long run. Not every test falls into that category, so it's a cost-benefit analysis.

He talks about fiddling with UI elements which is sometimes a one-time thing after you get it setup. Writing tests for that is sometimes a waste of time.

Now, if you have code that's going to do some form of complex screen manipulation and it's a big piece of what you're doing, it makes more sense to automate some tests.

fwanicka · on April 30, 2014

And do you write all those touch screen, antenna, microphone and speaker tests before you start writing any of the code?

zwieback · on April 30, 2014

True, we've spent a lot of time building machines with actuators, sensors, etc. to physically test UIs. Also, screen scraping to verify GUIs.

I get Uncle Bob's point, though, and I welcome what looks like a very reasonable peace offering to the zealots on the other side.

raverbashing · on April 30, 2014

These are integration tests really, not unit tests

Unless you do a small change then make the robot do everything.

(Also, this may also be used for device tests in the production line)

jshen · on April 30, 2014

"Over the years many people have complained about the so-called "religiosity" of some of the proponents of Test Driven Development. "

And Bob is guilty of that religiosity himself. See here https://www.youtube.com/watch?v=WpkDN78P884&feature=youtu.be...

Jump to the 58 minute mark if it doesn't automatically.

planetjones · on April 30, 2014

>> So near the physical boundary of the system there is a layer that requires fiddling. It is useless to try to write tests first (or tests at all) for this layer.

Maybe I can see what he's trying to say, but I don't think statement alone is accurate.

For the GUI i.e. at the human boundary of the system, the most value (especially to stop regression and catch side affects) is often added with tests e.g. automated tests which perform some user function in the GUI and assert the results.

Another physical boundary of the system is a database. Writing tests which cross this boundary add a lot of value too.

I'd favour these tests which hit the boundaries and go over them, over a codebase with only unit tests and endless mocking any day of the week.

Also these type of tests can be written first. We do it.

agentultra · on April 30, 2014

Unit tests don't have to test everything... just the units. Integration tests should be testing the interactions between different modules. And system tests should be the whole stack top-to-bottom. It ends up looking like a pyramid.

planetjones · on April 30, 2014

Yes thank you I know the testing pyramid. But testing should concentrate on what adds value and a zillion unit tests, with everything mocked often don't. Infact they can be a distraction from the bigger picture. Rather than rules, methodologies etc. we should TWMS - Test What Makes Sense. Or even better TWAV - Test What Adds Value. Or DTFTS - Don't Test For Testings' Sake.

agentultra · on April 30, 2014

If I understood correctly I think that's what Uncle Bob was trying to argue for in this blog post.

I don't think having 3-4 tests per LOC is such a bad thing. Far more testing goes into the sqlite codebase and I think it'd be hard to argue that it could have been better if the developers had stopped wasting their time and concentrated on what added value.

nsfyn55 · on April 30, 2014

This article is a breath of fresh air. I can't how many times I've encountered the dogmatic TDD adherent. I write more tests than anyone I know and what I have learned is TDD is great except when the cost of TDD outweighs its benefits. I've seen a dev spend 8 hours fiddling around with Mocha/Chai trying to test if a button changes color in response to a successful callback. Sometimes its good enough to click the button and see if it changes color.

nimblegorilla · on April 30, 2014

We've seen most of this argument before, but the most interesting (new) part of the article is the implication that CSS is the final layer between software and the physical world and thus hard to test. I'm sure the people on the Mozilla, Chrome, Opera, and IE projects would disagree that CSS is untestable.

It seems Uncle Bob implies that it's ok to skip TDD if you think it is hard to test something. There are much better reasons for most apps to avoid testing their CSS. Likewise there are many reasons for some projects to have extensive automated testing around CSS even if it might be hard.

asgard1024 · on April 30, 2014

"I have often compared TDD to double-entry bookkeeping."

It always seemed to me, if you were to make perfect, automated tests that 100% cover your application, you would basically have reimplemented it. (Or in other words - if you want to check if your calculations are correct, you have to do the calculations again.) Ideal, fully automated tests are basically taking two implementations, run them side-by-side, and compare the results.

That's why I am not a big fan of tests, in the sense that there is too much focus on them in the SW industry, and they seem like a hammer (useful but overused).

I think there should be more focus on writing the _one_ implementation correctly. This can be done with better abstractions (e.g. actor model for concurrency, functional programming, ..) and asserts (programming by contract), and maybe even automated SW proving. I don't think these techniques are as popular as testing, but I wish they were more popular, because they let you write programs only once and correctly.

[Update: To specifically expand on point about asserts, if you can trivially convert test to assert, why not do it? Unfortunately tooling doesn't support asserts as much as tests.]

jdlshore · on April 30, 2014

You're right in a sense; in TDD, you're basically saying a thing, and saying the inverse of a thing, and then making sure they line up. I find that my codebases are about 50% test code and 50% production code.

Your implication that this is wasteful is wrong, though. My production code is smaller when I work this way because I do more refactoring, and for projects that live longer than a month or two, I go faster.

By the way, TDD is absolutely in the same vein as design by contract and formal proofs. Both involve saying the same thing twice, in two different ways. TDD is a sloppier, more practical version of the same basic idea. See "Worse is Better."

jimejim · on April 30, 2014

Yes, some people argue that your tests are more like a specification, so imagine that every feature would have tests. Some even argue for specifying things like constants, which runs contrary to what Martin says in this article.

Overkill, IMHO, but tests have their place.

I tend to look at it now as test stuff that's being used often or stuff that may break easily when I'm making changes. At a certain point, automation of key tests make sense if you're wasting too much time manually testing.

williamcotton · on April 30, 2014

A good tool for testing visual interfaces is an image diff combined with a manual testing process.

Initially the tester will view each screenshot of an application state that is being tested and set that as "passing". Next an automated test runs and the latest screenshots are compared to the passing screenshots.

If they are different then the test fails. A manual tester then needs to take a look at the tests that failed and decide if the test actually failed or if the changes were supposed to be there.

If the changes were supposed to be there the tester can make this image the new passing screenshot. Passing screenshots should probably be reset BEFORE the tests are run. I see no reason why not to just check these images in to the repo along with all of the other test conditions.

I've been scheming on ways to do video diffs for testing transitions and animations although I'm not sure if this provides much value. It would be mostly an academic pursuit.

mbrock · on April 30, 2014

There seems to be two aspects to this discussion.

One involves questions of process, enforcing TDD, and whether or not TDD can save a bad team from producing bad stuff. The other is the question of what TDD can offer for a skilled team with an intelligent approach to development.

Mixing these aspects leads people to dismiss TDD because they've seen teams fail by doing TDD in a bad way.

Another question: is there a way to structure software so that questions of boundaries and collaborators become less troublesome for testing? I think a promising road is in value-oriented programming without side effects.

Another way to see that: if you need a lot of tedious mocking to test your unit, maybe the unit should be redesigned to have fewer collaborators, or maybe you should move the complex logic to a pure function, and so on. Maybe TDD difficulties are showing us that there is something wrong with how we write code. After all that's what it's supposed to do.

lclarkmichalek · on April 30, 2014

So you're telling me that being pragmatic will result in sensible solutions? I could do with more blog posts like this!

dllthomas · on April 30, 2014

"So near the physical boundary of the system there is a layer that requires fiddling. It is useless to try to write tests first (or tests at all) for this layer. The only way to get it right is to use human interaction; and once it's right there's no point in writing a test."

This seems dead wrong. There is probably no way to write tests first in this environment, but with so many different browsers interpreting your CSS (to run with preceding example) you need to be aware of when changes in your code cause changes in rendering that might need to be revalidated and further fiddled! I do agree that it doesn't fit well with TDD, but it absolutely can work with automated testing.

_ikke_ · on April 30, 2014

He is talking about TDD / unittests in this case. He is not saying anything about automated testing in general.

dllthomas · on April 30, 2014

He is focusing on TDD, but in a few places he clearly generalizes. In the quoted bit, the "(or tests at all)", and a paragraph down we have,

"Anything that requires human interaction and fiddling to get correct [...] doesn't require automated tests."

Again, I believe I agree with the narrower case of TDD. I was objecting to the generalization.

DanielBMarkham · on April 30, 2014

"...software controls machines that physically interact with the world..."

See, that's not always true. I would love it if all software interacted with the outside world. But a lot of software doesn't interact -- just take a look at some of that code sitting in your repository sometime. Some of that isn't deployed, isn't being used. You could test that until the cows come home and have a whole bucket full of nothing.

Because the Bobster and the other TDD guys are correct: you gotta test to know that the code is doing what it's supposed to. Testing has to come first. In a way, the test is actually more important than the code. If you get the tests right, and the code passes them, the code itself really doesn't matter.

Where we fall down is when we confuse the situation of a commercial software development team working on WhipSnapper 2.0 with a startup team working on SnapWhipper 0.1. The commercial guys? They are working on a piece of code with established value, with a funding agent in place, with a future of many years (hopefully) in production. Everything they create will be touched and used over a long period of time. The startup guys? They've got a 1-in-10 shot that they're alive next year. Any energy they put into solving a problem that hasn't been economically validated is 90% likely to be wasted.

Tests are important, but only when you're testing the right thing. The test for the startup guys is a business test, not a code test. Is this business doing something useful? If so, then just about any kind of way of accomplishing that -- perhaps without any programming at all -- provides business value.

That's a powerful lesson for startup junkies to assimilate. In the startup world, you don't get rewarded based on the correctness or craftsmanship of your code. You're looking at one or two weeds instead of realizing the entire yard needs work.

Put a different way, we have Markham's Law: The cost of Technical Debt can never exceed the economic value of the software to begin with.

</standard TDD comment>

robmcm · on April 30, 2014

I have never found it to work in visual/interactive development. A lot of the time you are working on something you evolve as you develop, try, iterate again.

I can see it's benefits if you have a simpler I/O for your code.

pornel · on April 30, 2014

TDD for CSS is indeed an odd concept, but it's possible to do automated CSS regression testing:

http://tldr.huddle.com/blog/css-testing/

harel · on April 30, 2014

As a non religious person, there's one thing I don't get in the whole to TDD or not to TDD debate that's ongoing now.

Does it matter if "TDD says this or says that"? Aren't these methodologies more of 'suggestions' for us to adopt as it fits our needs, while trimming the stuff that doesn't? Once you adhere to a methodology religiously you lose the flexibility and pragmatism that methodology intended to give you. It just becomes systematic Dogma following of a rule book, like any religion.

dllthomas · on April 30, 2014

"How can I test that the right stuff is drawn on the screen? Either I set up a camera and write code that can interpret what the camera sees, or I look at the screen while running manual tests."

Or screenshots, of course, which is still relying on code obviously but probably not your code. Of course, defining what you're looking for in a screen shot is going to be nontrivial unless you're doing simple check that it hasn't changed from the last manually-approved version or something.

platz · on April 30, 2014

I am reminded of this somewhat recent discussion when TDD doesn't work (which I believe uncle Bob responded to as well)

https://news.ycombinator.com/item?id=7130765

tempodox · on April 30, 2014

Why is it that we need articles like this to tell us that there is no silver bullet and we had better use our brains instead of the Methodology Du Jour? Sometimes, HN just makes me want to cry...

nsfyn55 · on April 30, 2014

I think critical programmers from every generation back to Ada Lovelace have made this observation. When the solution has started costing more than the problem...stop.

dllthomas · on April 30, 2014

"However, software is different from accounting in one critical way: software controls machines that physically interact with the world."

Doesn't accounting?

pjmlp · on April 30, 2014

So the acceptance that there are lots of scenarios where one cannot write tests first.

A good post for the TDD evangelists I have met so far.

raverbashing · on April 30, 2014

"But if I want to be sure that the bell rings when the proper signals are sent to the driver, I either have to set up that microphone or just listen to the bell.

How can I test that the right stuff is drawn on the screen? Either I set up a camera and write code that can interpret what the camera sees, or I look at the screen while running manual tests."

I have nothing else to add

tonyarkles · on April 30, 2014

Or you programmatically capture the screen output and compare it to a known-good capture. Not the best solution, but it's a solution that I've seen used. 99% of the time, the screen output is the same, and occasionally it'll be different and require human intervention to determine whether the change is right or wrong.

Getting notified when things aren't what you'd expect them to be is pretty valuable.

JabavuAdams · on April 30, 2014

This can work well for form-like things, but how would you do this for a 3d game, e.g.

Hmm. Computer vision project #346.

fixermark · on April 30, 2014

http://www.sikuli.org/

We used this in a videogame engine test framework to verify specific game states against gold-master images. It was particularly useful for verifying when our physics engine had changed in subtle ways; one of our tests involved dropping a couple of boxes on top of each other and verifying where they landed.

All kinds of stuff would disrupt that test---which made it great for knowing when we'd changed something subtle that would have real impact on our game engine's users.

raverbashing · on April 30, 2014

Yeah, this is complicated, it may have changed but it may still be correct

like a value of 15.000002 instead of 15.000013 but errors may accumulate in the end.

JabavuAdams · on May 1, 2014

Yeah, I love Sikuli. I've only used it to automate clicking through game GUI, not for anything in the game scene itself.