December 2009
enriquepablo / nl / wiki / Home — bitbucket.org
nl is a Python library that exposes a declarative API for building sentences and rules. These are used as input for a knowledge base built on the CLIPS production system. CLIPS builds a Rete network from the rules and sentences, which can then be queried efficiently for their consequences.
The main claim of nl is to offer a syntax that can accommodate any coherent theory that we may build in natural language (in the same sense as something like the semantic web's OWL-Full would), while at the same time being based on a simple finite-domain first-order theory. This theory is NL, a discussion of which can be found here. This discussion is probably required reading to understand the breadth and the limits of nl, but not to start using it.
November 2009
ry's http-parser at master - GitHub
This is a parser for HTTP messages written in C. It parses both requests and responses. The parser is designed to be used in high-performance HTTP applications. It does not make any allocations, it does not buffer data, and it can be interrupted at any time. It only requires about 128 bytes of data per message stream (in a web server, that is per connection).
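The "no buffering, interruptible at any time" design can be illustrated with a small state machine. The following is only a Python sketch of the general idea, not the C library's API: the class `RequestLineParser`, the `on_data` callback, and the field names are hypothetical, and the sketch handles only a request line. The parser keeps a single state value between calls and hands each fragment of a field straight to the callback, so a field split across chunks may trigger several callbacks.

```python
# States of the toy parser (hypothetical names, not taken from http-parser).
METHOD, URL, VERSION, DONE = range(4)
FIELD_NAMES = {METHOD: "method", URL: "url", VERSION: "version"}


class RequestLineParser:
    """Parses 'METHOD SP URL SP VERSION CR' fed in arbitrary chunks."""

    def __init__(self, on_data):
        # on_data(field_name, fragment) may fire several times per field.
        self.state = METHOD
        self.on_data = on_data

    def feed(self, chunk):
        """Consume one chunk and return the number of bytes consumed."""
        start = 0  # start of the current fragment inside this chunk
        for i, byte in enumerate(chunk):
            if self.state == DONE:
                return i
            if self.state == VERSION:
                is_delimiter = byte in (0x0D, 0x0A)  # CR or LF ends the line
            else:
                is_delimiter = byte == 0x20  # a space ends method and URL
            if is_delimiter:
                if i > start:
                    self.on_data(FIELD_NAMES[self.state], chunk[start:i])
                self.state += 1
                start = i + 1
        # Chunk exhausted mid-field: emit what we have, keep only the state.
        if self.state != DONE and start < len(chunk):
            self.on_data(FIELD_NAMES[self.state], chunk[start:])
        return len(chunk)


# Usage: the caller, not the parser, decides whether to accumulate fragments.
fields = {}


def collect(name, fragment):
    fields[name] = fields.get(name, b"") + fragment


parser = RequestLineParser(collect)
for chunk in (b"GE", b"T /ind", b"ex.html HT", b"TP/1.1\r\n"):
    parser.feed(chunk)
```

Because the only state carried between calls is one integer, feeding can stop after any chunk and resume later without the parser holding any copied data.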
October 2009
Whatpm::HTML - An HTML Parser and Serializer
rdfa_parser | gemcutter | awesome gem hosting
Yields each triple, or generates an in-memory graph
pyparsing
ONLamp.com: Building Recursive Descent Parsers with Python
What is "parsing"? Parsing is processing a series of symbols to extract their meaning. Typically, this means reading the words of a sentence and drawing information from them. When application programs need to process data that is provided as text, they must use some form of parsing logic. This logic scans the text characters and character groups (words) and recognizes patterns of groups to extract the underlying commands or information.
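As a concrete illustration of scanning characters, grouping them into tokens, and recognizing patterns of tokens, here is a minimal hand-written recursive descent parser for arithmetic expressions. It is a sketch using only the standard library, not pyparsing; the grammar and the names `tokenize`, `Parser`, and `evaluate` are assumptions chosen for the example.

```python
import re

# One token is an integer or a single operator/parenthesis character.
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split the input into number and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    # Grammar, one method per rule:
    #   expr   := term (('+' | '-') term)*
    #   term   := factor (('*' | '/') factor)*
    #   factor := NUMBER | '(' expr ')'
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.advance() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.advance() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    def factor(self):
        token = self.advance()
        if token == "(":
            value = self.expr()
            if self.advance() != ")":
                raise ValueError("expected ')'")
            return value
        if token is not None and token.isdigit():
            return int(token)
        raise ValueError(f"unexpected token {token!r}")


def evaluate(text):
    """Parse and evaluate an arithmetic expression."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise ValueError(f"unexpected token {parser.peek()!r}")
    return value


result = evaluate("2 + 3 * (4 - 1)")  # 11
```

Each grammar rule maps to one method, and operator precedence falls out of which rule calls which: `expr` defers to `term`, which defers to `factor`.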
August 2009
Character encoding detection for external scripts
This is (EF BB BF) C3 B6 3D 22 21 22 loaded into browsers under various labels. That happens to be properly formed ECMAScript code for all the encodings used. The bogus results for Opera9 can easily be reproduced in context of the testing script, but probably not individually from a clean cache; what's going on there is unknown. I also noted in running these tests that Opera claims "Opera supports the entire ECMA-262 2nd and 3rd standards with no exceptions" while in fact their implementation does not: the parser rejects code that follows the IdentifierStart :: UnicodeEscapeSequence production of ECMA-262 section 7.6. Instead it implements Opera-only extensions, like comma-free arrays à la [ 1 2 3 ]. Other fun facts include: IE does not implement onload for iframes and cannot modify the innerHTML of tr elements; Firefox ignores "tags" when setting the innerHTML of dynamically created tr elements with no ownerElement... Oh and Opera again needs /th "tags" so it won't nest adjacent th elements when setting innerHTML.
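To see why the label matters, the same six bytes can be decoded under different encodings. This is a small Python sketch; the excerpt does not say which encodings the tests used, so UTF-8 and ISO-8859-1 here are my own choice for illustration.

```python
# The test payload from the excerpt, without and with the optional UTF-8 BOM.
payload = bytes.fromhex("C3 B6 3D 22 21 22")
bom = bytes.fromhex("EF BB BF")

# Under UTF-8, C3 B6 is a single character (U+00F6).
as_utf8 = payload.decode("utf-8")  # ö="!"

# Under ISO-8859-1, every byte is its own character (U+00C3, U+00B6, ...).
as_latin1 = payload.decode("iso-8859-1")  # Ã¶="!"

# The "utf-8-sig" codec strips a leading BOM before decoding.
with_bom = (bom + payload).decode("utf-8-sig")

print(len(as_utf8), len(as_latin1), with_bom == as_utf8)
```

The script a browser actually executes therefore depends on which encoding it settles on for the external file.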
RDFa Fragment Parser
Paste a chunk of XHTML RDFa below, and click "Parse."
make sure you do the right thing for RDFa validation when you eventually place this chunk inside a web page
July 2009
Sparkles everywhere, CubicWeb gets fizzy (CubicWeb's Forge)
Fyzz parses the SPARQL query and generates something we decided to call an AST although it's still a bit rough for now. Fyzz understands simple triples, distincts, limits, offsets and other basic functionalities.
fyzz (fyzz is a sparkling Python parser for the Sparql query language) (Logilab.org)
fyzz is a sparkling Python parser for the Sparql query language
John Resig - HTML 5 Parsing
If you're interested in giving the new parser a try (it's doubtful that you'll see many obvious changes - but any help in hunting down bugs would be appreciated) you can download a nightly of Firefox, open about:config, and set html5.enable to true.
May 2009
Python Package Index : pyWxSVG 0.1
View and print SVG files or SVG content, and convert SVG to raster graphics. Partial support for the SVG format. Tested with Python 2.5 and wxPython 2.8.9.2. Drawing uses the wx.GraphicsContext class. The path parser comes from Enable (the SVGPathParser class).
March 2009
RFC (2)822 & 3696 Email Address Parser in PHP
The test suite shows results for each parser, based on these test definitions. These are borrowed from Dominic Sayers who has a similar parser. We are still arguing over certain tests ;)
February 2009
Les parsers HTML5 - La Tortue Cynique / The Cynical Turtle
In short, we therefore need a specific parser (after 30 years of working with generic GML and SGML parsers),
January 2009
November 2008
PHP Simple HTML DOM Parser
- An HTML DOM parser written in PHP 5+ that lets you manipulate HTML in a very easy way!
- Requires PHP 5+.
- Supports invalid HTML.
- Find tags on an HTML page with selectors just like jQuery.
- Extract contents from HTML in a single line.
