Document Imaging System sites now make reference to fuzzy logic technologies that can clean up bad OCR output to some acceptable rate, so affordable technology could help out here. One remaining problem is trusting the fuzzy logic not to have altered the original. This is similar to the issue of legal document image fidelity: many systems don't accept a document as legal if it could have been altered by any means, i.e., the identity problem.

The application of a technology usually can't be divorced from the content it operates over. As long as the identity requirement doesn't enter in, the process can be lightly defined, and the fast-fingered typist is as good as the OCR, all other things being equal (which they aren't, but that's a longer story).

So we come back around to this: simple is OK until you have strict requirements and money on the table. On the other hand, any project I've ever worked on that relied on volunteer effort had to be simple, or the chance of success was very low. I wonder if the Web Design Principles are different with large, well-funded organizations in the loop. Consider the NASA effect: when an engineering organization transforms into an engineering project management organization, does the quality of the product change, or only the rate of the process?

len

-----Original Message-----
From: Jeff Greif [mailto:jgreif@a...]

It's also a question of volume. A 1% error rate that needs human cleanup is not a big deal when you only see 100 docs per day, but it mounts up when there are a million.

Analogy: A friend is slowly scanning and turning into PDF files all the reprints and preprints (in planetary science) that he's collected since the late 1960s. He runs the scanner more or less continuously while at home, takes the files produced with him on his laptop when he travels, and does a sort of desultory fixup of the OCR (since he has the page images as well) as a lulling airplane activity. Serious fixup occurs when he actually has to consult the paper for details.

From: "Paul Prescod" <paul@p...>
> Having computers and humans working together is great. But you seem to
> propose that users should be required to handle the exceptional cases
> that computers handle poorly. I'd suggest instead that the users would
> rather work with programmers (or visual mapping tools) to automate away
> those exceptional cases so that they can be freed up to do creative
> work.
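
To put rough numbers on the volume point in this thread (1% needing human cleanup at 100 docs per day versus a million), here is a back-of-the-envelope sketch. It assumes the 1% means the share of documents that need a human fix, and the five minutes per fix is a purely hypothetical figure chosen for illustration; neither the function names nor that figure come from the thread.

```python
def docs_needing_cleanup(docs_per_day: int, error_rate: float) -> float:
    """Expected number of documents per day that need a human fix."""
    return docs_per_day * error_rate


def cleanup_hours(docs_per_day: int, error_rate: float,
                  minutes_per_fix: float) -> float:
    """Expected human hours per day spent on cleanup.

    minutes_per_fix is a hypothetical average, not a measured value.
    """
    return docs_needing_cleanup(docs_per_day, error_rate) * minutes_per_fix / 60


if __name__ == "__main__":
    ERROR_RATE = 0.01        # the 1% from the thread
    MINUTES_PER_FIX = 5.0    # assumption for illustration only

    for volume in (100, 1_000_000):
        docs = docs_needing_cleanup(volume, ERROR_RATE)
        hours = cleanup_hours(volume, ERROR_RATE, MINUTES_PER_FIX)
        print(f"{volume:>9} docs/day: {docs:,.0f} to fix, "
              f"about {hours:,.1f} person-hours/day")
```

Under those assumptions, the same error rate moves from a few minutes of cleanup per day to a workload that needs a sizable staff, which is where automating away the exceptional cases starts to pay for itself.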



