Calling it a "prominent open source project" is an understatement. Granted, it's not perfect, but it's the de facto standard in FOSS OCR software. Mention "tesseract" to just about anyone in the data mining/machine learning/artificial intelligence communities (which is pretty much the target user base), and they will automatically think you're referring to the OCR software.