PUBLIC MARKS with tag parser

December 2009

enriquepablo / nl / wiki / Home — bitbucket.org

by karlcow

nl is a Python library that exposes a declarative API for building sentences and rules. These are used as input for a knowledge base built on the CLIPS production system. CLIPS builds a Rete network from the rules and sentences, which can then be queried efficiently for their consequences.

The main claim of nl is to offer a syntax that can accommodate any coherent theory that we may build in natural language (in the same sense as something like the Semantic Web's OWL Full would), while at the same time being based on a simple finite-domain first-order theory. This theory is NL, a discussion of which can be found here. This discussion is probably required reading to understand the breadth and the limits of nl, but not to start using it.

November 2009

ry's http-parser at master - GitHub

by karlcow

This is a parser for HTTP messages written in C. It parses both requests and responses. The parser is designed to be used in high-performance HTTP applications. It does not make any allocations, it does not buffer data, and it can be interrupted at any time. It only requires about 128 bytes of data per message stream (in a web server, that is per connection).
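
The "no buffering, interruptible at any time" design can be illustrated with a toy state machine. The sketch below is in Python rather than C, handles only a request line, and is not http-parser's API: the class `RequestLineParser`, its `feed` method and the `on_data` callback are hypothetical names chosen for illustration. The parser keeps only its current state between calls and hands each fragment to the callback as soon as it is seen.

```python
class RequestLineParser:
    """Toy incremental parser for 'METHOD SP URL SP VERSION CRLF'.

    Hypothetical illustration only; this is not http-parser's API.
    """

    METHOD, URL, VERSION, DONE = range(4)
    NAMES = ("method", "url", "version")

    def __init__(self, on_data):
        self.state = self.METHOD
        self.on_data = on_data  # called as on_data(field_name, fragment)

    def feed(self, chunk):
        """Consume one chunk of bytes; return how many bytes were used."""
        start = 0
        for i, byte in enumerate(chunk):
            if self.state == self.DONE:
                return i
            is_space = byte == 0x20 and self.state != self.VERSION
            is_newline = byte == 0x0A and self.state == self.VERSION
            if byte == 0x0D or is_space or is_newline:
                # Hand over what we have seen of the current field so far.
                if i > start:
                    self.on_data(self.NAMES[self.state], chunk[start:i])
                start = i + 1
                if byte != 0x0D:  # CR is skipped; SP and LF end a field
                    self.state += 1
        # The chunk ended mid-field: emit the partial fragment, keep the state.
        if self.state != self.DONE and start < len(chunk):
            self.on_data(self.NAMES[self.state], chunk[start:])
        return len(chunk)


# Usage: the caller, not the parser, decides whether to join fragments.
fields = {}

def collect(name, fragment):
    fields[name] = fields.get(name, b"") + fragment

parser = RequestLineParser(collect)
for chunk in (b"GE", b"T /ind", b"ex.html HT", b"TP/1.1\r\n"):
    parser.feed(chunk)
print(fields)
```

Because the input can be split at any byte, the caller can stop feeding data at any point and resume later with the same parser object.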

October 2009

Whatpm::HTML - An HTML Parser and Serializer

by karlcow

rdfa_parser | gemcutter | awesome gem hosting

by karlcow

Yields each triple, or generates an in-memory graph

pyparsing

by karlcow

The pyparsing module is an alternative approach to creating and executing simple grammars, vs. the traditional lex/yacc approach, or the use of regular expressions. With pyparsing, you don't need to learn a new syntax for defining grammars or matching expressions - the parsing module provides a library of classes that you use to construct the grammar directly in Python.

ONLamp.com: Building Recursive Descent Parsers with Python

by karlcow

What is "parsing"? Parsing is processing a series of symbols to extract their meaning. Typically, this means reading the words of a sentence and drawing information from them. When application programs need to process data that is provided as text, they must use some form of parsing logic. This logic scans the text characters and character groups (words) and recognizes patterns of groups to extract the underlying commands or information.
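
As a rough illustration of the scanning and pattern recognition described above, here is a minimal recursive descent parser for arithmetic expressions. It is a sketch written for this page, not code from the ONLamp article; the grammar and the names `tokenize`, `Parser` and `evaluate` are assumptions. Each grammar rule (expr, term, factor) becomes one method that calls the others.

```python
import re

# One token is a run of digits or a single operator/parenthesis.
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Scan the characters and group them into tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        # expr := term (('+' | '-') term)*
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.advance() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        # term := factor (('*' | '/') factor)*
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.advance() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    def factor(self):
        # factor := NUMBER | '(' expr ')'
        token = self.advance()
        if token == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value


print(evaluate("2 + 3 * (4 - 1)"))  # prints 11
```

Operator precedence falls out of the call structure: `expr` calls `term`, which calls `factor`, so multiplication binds tighter than addition without any explicit precedence table.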

Snowball

by karlcow & 1 other

Snowball is a small string processing language designed for creating stemming algorithms for use in Information Retrieval. This site describes Snowball, and presents several useful stemmers which have been implemented using it.

August 2009

Character encoding detection for external scripts

by karlcow

This is (EF BB BF) C3 B6 3D 22 21 22 loaded into browsers under various labels. That happens to be properly formed ECMAScript code for all the encodings used. The bogus results for Opera9 can easily be reproduced in the context of the testing script, but probably not individually from a clean cache; what's going on there is unknown. I also noted in running these tests that Opera claims "Opera supports the entire ECMA-262 2nd and 3rd standards with no exceptions" while in fact their implementation does not: the parser rejects code that follows the IdentifierStart :: UnicodeEscapeSequence production of ECMA-262 section 7.6. Instead it implements Opera-only extensions, like comma-free arrays à la [ 1 2 3 ]. Other fun facts include: IE does not implement onload for iframes and cannot modify the innerHTML of tr elements; Firefox ignores "tags" when setting the innerHTML of dynamically created tr elements with no ownerElement... Oh, and Opera again needs /th "tags" so it won't nest adjacent th elements when setting innerHTML.
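
To see why the label matters, the same bytes can be decoded under two different encodings. This is a small sketch using Python's standard codecs, not the author's test script; ISO-8859-1 is picked here only as an example label, and the variable names are illustrative.

```python
import codecs

# The six payload bytes quoted above, without the optional BOM.
payload = bytes.fromhex("C3 B6 3D 22 21 22")
# The same payload with the UTF-8 byte order mark (EF BB BF) in front.
with_bom = codecs.BOM_UTF8 + payload

as_utf8 = payload.decode("utf-8")         # C3 B6 is one character: ö
as_latin1 = payload.decode("iso-8859-1")  # C3 and B6 are two characters: Ã ¶
bom_stripped = with_bom.decode("utf-8-sig")  # this codec drops a leading BOM

print(as_utf8)    # ö="!"
print(as_latin1)  # Ã¶="!"
print(bom_stripped == as_utf8)  # True
```

The variable name assigned by the script therefore differs depending on which encoding the browser picks, which is what lets a test tell the encodings apart.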

RDFa Fragment Parser

by karlcow

Paste a chunk of XHTML RDFa below, and click "Parse."

Make sure you do the right thing for RDFa validation when you eventually place this chunk inside a web page.

July 2009

Sparkles everywhere, CubicWeb gets fizzy (CubicWeb's Forge)

by karlcow

Fyzz parses the SPARQL query and generates something we decided to call an AST although it's still a bit rough for now. Fyzz understands simple triples, distincts, limits, offsets and other basic functionalities.

John Resig - HTML 5 Parsing

by karlcow

If you're interested in giving the new parser a try (it's doubtful that you'll see many obvious changes - but any help in hunting down bugs would be appreciated) you can download a nightly of Firefox, open about:config, and set html5.enable to true.

May 2009

Python Package Index : pyWxSVG 0.1

by karlcow

View and print SVG files or SVG content, and convert SVG to raster graphics. Partial support for the SVG format. Tested with Python 2.5 and wxPython 2.8.9.2. Drawing uses the wx.GraphicsContext class. The path parser comes from Enable (the SVGPathParser class).

March 2009

RFC (2)822 & 3696 Email Address Parser in PHP

by karlcow

The test suite shows results for each parser, based on these test definitions. These are borrowed from Dominic Sayers who has a similar parser. We are still arguing over certain tests ;)

February 2009

Les parsers HTML5 - La Tortue Cynique / The Cynical Turtle

by karlcow

In short, we therefore need a specific parser (after 30 years of working with generic GML and SGML parsers),

January 2009

November 2008

PHP Simple HTML DOM Parser

by srcmax & 7 others, 3 comments

  • An HTML DOM parser written in PHP 5+ that lets you manipulate HTML very easily.
  • Requires PHP 5+.
  • Supports invalid HTML.
  • Finds tags on an HTML page with selectors, just like jQuery.
  • Extracts contents from HTML in a single line.
