FYI: You should reply using the reply links. It's easier to follow threads that way.
Also, does the matching handle blanks or minor variations? Typical clause libraries often take this form: "The Company hereby agrees to sell you [_________________] shares of stock." You probably wouldn't need NLP to match that.
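To illustrate what I mean, here is a rough sketch of matching a clause with a blank using nothing but a regular expression. The helper names (`template_to_regex`, `matches`) are made up for this example, and it assumes blanks are written as runs of three or more underscores, optionally in brackets. It is not how the tool in question works, just one way it could be done.

```python
import re

# Assumption: a "blank" is 3+ underscores, optionally wrapped in [ ].
BLANK = re.compile(r"\[?_{3,}\]?")

def template_to_regex(template: str) -> re.Pattern:
    """Turn a clause template with blanks into a regex that accepts any fill-in."""
    parts = BLANK.split(template)
    # Escape the literal text; let any run of whitespace match any other run.
    escaped = [r"\s+".join(re.escape(word) for word in part.split())
               for part in parts]
    # Each blank becomes a lazy wildcard with optional surrounding whitespace.
    return re.compile(r"\s*(.+?)\s*".join(escaped), re.IGNORECASE)

def matches(template: str, text: str) -> bool:
    """True if text is the template with each blank filled in (or left empty)."""
    return template_to_regex(template).fullmatch(text.strip()) is not None

template = ("The Company hereby agrees to sell you "
            "[_________________] shares of stock.")
print(matches(template, "The Company hereby agrees to sell you 500 shares of stock."))
```

This only covers blanks, case, and whitespace differences. Reworded clauses would still need something fuzzier.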
Sorry about not using the reply link. I was wondering why my comment was on top.
Right now I've got some off-the-shelf NLP stuff that does Org and Name recognition to remove those things (I've been working on a similar project for a while). Handling the blank lines should be trivial, but it isn't implemented yet. Most "get screwed" clauses don't have underlines, since they are boilerplate.