PUBLIC MARKS from falko with tag ocr

August 2007

Optical Character Recognition With Tesseract OCR On Ubuntu 7.04 | HowtoForge - Linux Howtos and Tutorials

This guide describes how to set up Tesseract OCR on Ubuntu 7.04. OCR means "Optical Character Recognition". The resulting system will be able to convert images with embedded text to text files. Tesseract is licensed under the Apache License v2.0.
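As a rough illustration of the image-to-text-file conversion described above, the following sketch wraps the `tesseract` command-line tool from Python. It assumes the `tesseract` binary is installed and on the PATH; the helper names `build_tesseract_command` and `image_to_text` are hypothetical and not taken from the guide. Older Tesseract releases may accept only TIFF input, so the image may need converting first.

```python
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argv list for: tesseract <image> <output_base> -l <lang>."""
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def image_to_text(image_path, lang="eng"):
    """Run Tesseract on one image and return the recognized text.

    Assumption: the `tesseract` binary is installed and on PATH.
    """
    image_path = Path(image_path)
    # Tesseract appends ".txt" to the output base name itself.
    output_base = image_path.with_suffix("")
    subprocess.run(
        build_tesseract_command(image_path, output_base, lang),
        check=True,  # raise CalledProcessError if tesseract fails
    )
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8")
```

Calling `image_to_text("scan.tif")` would, under these assumptions, write `scan.txt` next to the image and return its contents.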

February 2007

Fight Image Spam With FuzzyOCR And SpamAssassin On Debian/Ubuntu | HowtoForge - Linux Howtos and Tutorials

This tutorial describes how to scan emails for image spam with FuzzyOCR. FuzzyOCR is a plugin for SpamAssassin which is aimed at unsolicited bulk mail containing images as the main content carrier. Using different methods, it analyzes the content and properties of images to distinguish between normal mails (ham) and spam mails. FuzzyOCR tries to keep the system load low by scanning only mails that have not already been categorized as spam by SpamAssassin, thus avoiding unnecessary work.
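The load-saving idea described above (run the expensive image analysis only on mails not already classified as spam) can be sketched roughly as follows. This is not FuzzyOCR's actual code: the function names, the 5.0 threshold, the 0.8 similarity cutoff, and the use of Python's `difflib` for approximate word matching are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def should_run_ocr(current_score, spam_threshold=5.0):
    """Scan images only if the mail is not already classified as spam.

    Assumption: a mail counts as spam once its score reaches the threshold.
    """
    return current_score < spam_threshold


def fuzzy_hit(word, spam_words, min_ratio=0.8):
    """Return True if `word` approximately matches any known spam word.

    Approximate matching tolerates OCR errors and deliberate obfuscation.
    """
    word = word.lower()
    return any(
        SequenceMatcher(None, word, spam_word.lower()).ratio() >= min_ratio
        for spam_word in spam_words
    )


def count_spam_words(ocr_text, spam_words):
    """Count the words in OCR output that fuzzily match the spam word list."""
    return sum(fuzzy_hit(w, spam_words) for w in ocr_text.split())
```

A mail that already scores above the threshold skips OCR entirely, so the fuzzy matching cost is paid only for mails whose status is still undecided.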

falko's TAGS related to tag ocr

debian, fuzzyocr, gimp, image spam, imagemagick, linux, Optical Character Recognition, scanner, server, spam, spamassassin, tesseract, ubuntu