public marks

PUBLIC MARKS with tag taln


NLTK Home (Natural Language Toolkit)

by parmentierf (via)
Open source Python modules, linguistic data and documentation for research and development in natural language processing, supporting dozens of NLP tasks, with distributions for Windows, Mac OS X and Linux.


hutchens_phd_1999_the_upwrite_predictor.pdf (application/pdf object)

by parmentierf (via)
The UpWrite Predictor: A General Grammatical Inference Engine for Symbolic Time Series with Applications in Natural Language Acquisition and Data Compression

Blog: Query interfaces for the Semantic Web

by parmentierf
An interesting presentation at Google Tech Talks about different interfaces to query semantic data. Casual users were presented with 4 increasingly formal systems: keyword search, natural language search, controlled language search, and a graphical interface to build query patterns. Interestingly enough, the users liked natural language best, although keyword queries gave more accurate results.

uClassify - free text classifier web service

by parmentierf
uClassify is a free web service where you can easily create your own text classifiers. Examples:
* Spam filter
* Web page categorization
* Automatic e-mail support
* Language detection
* Written text gender recognition
* Sentiment
So what do you want to classify on? Only your imagination is the limit!

Home | OpenCalais

by parmentierf & 2 others
We want to make all the world's content more accessible, interoperable and valuable. Some call it Web 2.0, Web 3.0, the Semantic Web or the Giant Global Graph - we call our piece of it Calais. Calais is a rapidly growing toolkit of capabilities that allow you to readily incorporate state-of-the-art semantic functionality within your blog, content management system, website or application.


by parmentierf (via)
Free NLP tools (GPL):
* LIA PHON: a French text-to-phoneme converter
* LIA TAGG: a French and English tagger, lemmatizer, and bracketer, plus a re-accentuation tool (for French)
* LIA SCT: an implementation of Semantic Classification Trees, with some extra features
* LIA NE (beta): a simple named-entity tagger for French based on CRF++

Benoît Sagot - WOLF

by parmentierf
WOLF (Wordnet Libre du Français, "Free French WordNet") is a free semantic lexical resource (wordnet) for French.

Proxem > Home

by parmentierf (via)
Proxem isn't just a technology. It's a company and a vision informed by creativity and passion for all things possible with Natural Language Processing.

Cypher - Beta Release — - ai - ai software - semantic web - semantic web software - ai company - natural language processing - natural language processing software - RDF, FOAF, Friend of a Friend, DC, Dublin Core, RSS, SeRQL and SPARQL softwa

by parmentierf (via)
The Cypher™ beta release is an AI software program that generates the .rdf (RDF graph) and .serql (SeRQL query) representations of plain-language input, allowing users to speak plain language to update and query databases. With robust definition languages, Cypher's grammar and lexicon can quickly and easily be extended to process highly complex sentences and phrases of any natural language, and can cover any vocabulary. Equipped with Cypher, programmers can now begin building next generation semantic web applications that harness what is already the most widely used tool known to man: natural language.


Double Metaphone - Wikipédia

by parmentierf (via)
Double Metaphone is a phonetic search algorithm written by Lawrence Philips and is the second generation of the Metaphone algorithm. Its implementation was described in June 2000 in the C/C++ Users Journal. It is called "Double" because it can return both a primary and a secondary code for a string; this accounts for ambiguous cases and for multiple variants of names with a common ancestry. For example, encoding the name "Smith" yields the primary code SM0 and the secondary code XMT, while the name "Schmidt" yields the primary code XMT and the secondary code SMT; the two have XMT in common.
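
To make the "shared code" idea concrete, here is a minimal Python sketch. It does not implement the Double Metaphone encoder itself: the codes are simply copied from the Smith/Schmidt example in the description above, and the names `CODES`, `shared_codes` and `sounds_alike` are hypothetical, chosen only for illustration.

```python
# Illustrative only: the (primary, secondary) codes below are copied from the
# Smith/Schmidt example in the description; a real encoder would compute them.
CODES = {
    "Smith": ("SM0", "XMT"),
    "Schmidt": ("XMT", "SMT"),
}


def shared_codes(name_a, name_b, codes=CODES):
    """Return the set of phonetic codes the two names have in common."""
    return set(codes[name_a]) & set(codes[name_b])


def sounds_alike(name_a, name_b, codes=CODES):
    """Treat two names as a phonetic match if any of their codes coincide."""
    return bool(shared_codes(name_a, name_b, codes))


print(shared_codes("Smith", "Schmidt"))
print(sounds_alike("Smith", "Schmidt"))
```

Comparing both codes of each name, rather than the primary code alone, is what lets the two spellings match even though their primary codes differ.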

TEL :: [tel-00145147, version 1] Définitions et caractérisations de modèles à base d'analogies pour l'apprentissage automatique des langues naturelles

by parmentierf
Definitions and characterizations of analogy-based models for the machine learning of natural languages


Presentation of Theuth and Blue Moon

by parmentierf (via)
Presentation of Theuth and Blue Moon. A new type of parsing algorithm, described as "asyntagmatic". Without going into the details, the fact that the parsing is asyntagmatic unlocks everything: it becomes possible to take context into account, understand deictics, detect puns and spoonerisms, identify the language of a text, or translate texts in which several languages are mixed, even within the same sentence.

Nick Montfort's Computer and Information Science Work

by parmentierf

Official Google Research Blog: All Our N-gram are Belong to You

by parmentierf & 1 other (via)
Here at Google Research we have been using word n-gram models for a variety of R&D projects, such as statistical machine translation, speech recognition, spelling correction, entity detection, information extraction, and others. While such models have usually been estimated from training corpora containing at most a few billion words, we have been harnessing the vast power of Google's datacenters and distributed processing infrastructure to process larger and larger training corpora. We found that there's no data like more data, and scaled up the size of our data by one order of magnitude, and then another, and then one more - resulting in a training corpus of one trillion words from public Web pages.
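
For readers unfamiliar with word n-gram models, the following is a toy Python sketch of the basic idea: count runs of n consecutive words, then estimate how likely one word is to follow another. It is not Google's pipeline; the six-word corpus and the function names `ngram_counts` and `bigram_probability` are made up for illustration, and no smoothing is applied.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens in the list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bigram_probability(tokens, first, second):
    """Unsmoothed maximum-likelihood estimate of P(second | first)."""
    bigrams = ngram_counts(tokens, 2)
    # Number of bigrams that start with `first`.
    context = sum(count for (word, _), count in bigrams.items() if word == first)
    return bigrams[(first, second)] / context if context else 0.0


# Hypothetical toy corpus; real models are estimated from billions of words.
tokens = "the cat sat on the mat".split()
print(ngram_counts(tokens, 2))
print(bigram_probability(tokens, "the", "cat"))
```

With so little text, most word pairs are never observed, which is the sparsity problem that much larger training corpora help to reduce.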

Python for Linguistics - py4lx

by parmentierf (via)
This is an ickle collection of tutorials on using Python for doing interesting stuff with (human!) languages. They are posted initially on Hacklog, the Blogamundo developer blog, and then moved here, where they are endlessly tweaked to remove embarrassing errors and improve clarity. In theory they should be doable by folks with no programming background, or just a little.

Chatterbots, Tinymuds, and the Turing Test

by parmentierf (via)
This paper describes the development of one such Turing System, including the technical design of the program and its performance on the first three Loebner Prize competitions. We also discuss the program's four-year development effort, which has depended heavily on constant interaction with people on the Internet via Tinymuds (multiuser network communication servers that are a cross between role-playing games and computer forums like CompuServe). Finally, we discuss the design of the Loebner competition itself, and address its usefulness in furthering the development of Artificial Intelligence.
