August 2006
Ariel
A library that allows you to extract information from semi-structured documents (such as websites). Ariel uses a small number of labeled examples to generate and learn effective extraction rules.
July 2006
Scraping with style: scrAPI toolkit for Ruby
Scraping with Ruby using CSS selectors.
June 2006
SIMILE | Piggy Bank | How to Write Screen Scrapers
A screen scraper in Piggy Bank is a piece of software code that extracts “pure” information from within a web page’s content (and perhaps from related web pages). Screen scrapers can be implemented as XSL templates or in JavaScript. This document fo