public marks

PUBLIC MARKS from ogrisel

February 2009

CS 114b - Course on NLP with NLTK assignment

Course Overview: Provides a fundamental understanding of the problems in natural language understanding by computers, and of the theory and practice of current computational linguistic systems. Of interest to students of artificial intelligence, algorithms, and the computational processes of comprehension and understanding.

October 2008

Conditional Random Fields

Conditional random fields (CRFs) are a probabilistic framework for labeling and segmenting structured data, such as sequences, trees and lattices. The underlying idea is that of defining a conditional probability distribution over label sequences given a particular observation sequence, rather than a joint distribution over both label and observation sequences. The primary advantage of CRFs over hidden Markov models is their conditional nature, resulting in the relaxation of the independence assumptions required by HMMs in order to ensure tractable inference. Additionally, CRFs avoid the label bias problem, a weakness exhibited by maximum entropy Markov models (MEMMs) and other conditional Markov models based on directed graphical models. CRFs outperform both MEMMs and HMMs on a number of real-world tasks in many fields, including bioinformatics, computational linguistics and speech recognition.
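The conditional distribution described above is usually written, for a linear-chain CRF over an observation sequence x and label sequence y, in the following standard form (the feature functions f_k and weights λ_k are the model's parameters; this is the textbook formulation, not tied to any one paper linked here):

```latex
p(\mathbf{y} \mid \mathbf{x}) =
\frac{1}{Z(\mathbf{x})}
\exp\!\left( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, \mathbf{x}, t) \right),
\qquad
Z(\mathbf{x}) = \sum_{\mathbf{y}'}
\exp\!\left( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, \mathbf{x}, t) \right)
```

Normalizing by Z(x), a sum over label sequences only, is what makes the model conditional: no distribution over the observations x is ever estimated, which is exactly the relaxation of the HMM independence assumptions mentioned above.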

August 2008


Topics: Energy models, causal generative models vs. energy models in overcomplete ICA, contrastive divergence learning, score matching, restricted Boltzmann machines, deep belief networks

Modular toolkit for Data Processing (MDP)

Modular toolkit for Data Processing (MDP) is a Python data processing framework. Implemented algorithms include: Principal Component Analysis (PCA), Independent Component Analysis (ICA), Slow Feature Analysis (SFA), Independent Slow Feature Analysis (ISFA), Growing Neural Gas (GNG), Factor Analysis, Fisher Discriminant Analysis (FDA), Gaussian Classifiers, and Restricted Boltzmann Machines.
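To make the first algorithm on that list concrete, here is a minimal NumPy-only sketch of what a PCA node in a framework like MDP computes (this is the underlying linear algebra, not MDP's actual API):

```python
import numpy as np

# Minimal PCA: center the data, take the top-k principal directions via SVD,
# and project the data onto them. A framework like MDP wraps this logic in a
# reusable, trainable node.
def pca(x, k):
    x = x - x.mean(axis=0)                  # center each feature
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    components = vt[:k]                     # top-k principal directions
    return x @ components.T, components     # projected data, basis

rng = np.random.RandomState(0)
# Synthetic data: large variance along axis 0, small along axis 1.
data = rng.randn(200, 2) @ np.array([[3.0, 0.0], [0.0, 0.3]])
projected, comps = pca(data, 1)
# The single retained component should capture nearly all the variance.
```

A modular framework earns its keep when nodes like this are chained into flows (e.g. PCA for whitening followed by ICA), which is MDP's main design idea.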

lasvm [Léon Bottou]

LASVM is an approximate SVM solver that uses online approximation. It reaches accuracies similar to that of a real SVM after performing a single sequential pass through the training examples. Further benefits can be achieved using selective sampling techniques to choose which example should be considered next. As shown in the graph, LASVM requires considerably less memory than a regular SVM solver, which translates into a substantial speed advantage on large training sets. In fact, LASVM has been used to train a 10-class SVM classifier with 8 million examples on a single processor.
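The "single sequential pass" idea can be illustrated with a deliberately simple sketch. This is not LASVM itself (which maintains a sparse set of support vectors and works with kernels); it is a plain stochastic-gradient linear SVM, Pegasos-style, that sees each training example exactly once:

```python
import numpy as np

# One sequential pass of stochastic gradient descent on the hinge loss of a
# linear SVM. Illustrates online learning only; LASVM proper is a kernel
# method with online support-vector bookkeeping.
def online_svm(xs, ys, lam=0.01):
    w = np.zeros(xs.shape[1])
    for t, (x, y) in enumerate(zip(xs, ys), start=1):
        eta = 1.0 / (lam * t)               # decaying step size
        if y * np.dot(w, x) < 1.0:          # margin violated: hinge gradient
            w = (1 - eta * lam) * w + eta * y * x
        else:                               # margin satisfied: only shrink w
            w = (1 - eta * lam) * w
    return w

rng = np.random.RandomState(0)
x_pos = rng.randn(100, 2) + 2.0             # class +1 cluster
x_neg = rng.randn(100, 2) - 2.0             # class -1 cluster
xs = np.vstack([x_pos, x_neg])
ys = np.array([1.0] * 100 + [-1.0] * 100)
order = rng.permutation(200)                # one shuffled sequential pass
w = online_svm(xs[order], ys[order])
accuracy = np.mean(np.sign(xs @ w) == ys)
```

Because each example is touched once and then discarded, memory usage is independent of the training-set size, which is the property the blurb highlights.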

June 2008

An Empirical Evaluation of Deep Architectures on Problems with Many Factors of Variation [PDF]

Recently, several learning algorithms relying on models with deep architectures have been proposed. Though they have demonstrated impressive performance, to date, they have only been evaluated on relatively simple problems such as digit recognition in a controlled environment, for which many machine learning algorithms already report reasonable results. Here, we present a series of experiments which indicate that these models show promise in solving harder learning problems that exhibit many factors of variation. These models are compared with well-established algorithms such as Support Vector Machines and single hidden-layer feed-forward neural networks.

YouTube - Visual Perception with Deep Learning

A long-term goal of Machine Learning research is to solve highly complex "intelligent" tasks, such as visual perception, auditory perception, and language understanding. To reach that goal, the ML community must solve two problems: the Deep Learning Problem and the Partition Function Problem.

There is considerable theoretical and empirical evidence that complex tasks, such as invariant object recognition in vision, require "deep" architectures, composed of multiple layers of trainable non-linear modules. The Deep Learning Problem is related to the difficulty of training such deep architectures. Several methods have recently been proposed to train (or pre-train) deep architectures in an unsupervised fashion. Each layer of the deep architecture is composed of an encoder, which computes a feature vector from the input, and a decoder, which reconstructs the input from the features. A large number of such layers can be stacked and trained sequentially, thereby learning a deep hierarchy of features with increasing levels of abstraction.

The training of each layer can be seen as shaping an energy landscape with low valleys around the training samples and high plateaus everywhere else. Forming these high plateaus constitutes the so-called Partition Function Problem. A particular class of methods for deep energy-based unsupervised learning will be described that solves the Partition Function Problem by imposing sparsity constraints on the features. The method can learn multiple levels of sparse and overcomplete representations of data. When applied to natural image patches, the method produces hierarchies of filters similar to those found in the mammalian visual cortex. An application to category-level object recognition with invariance to pose and illumination will be described (with a live demo). Another application to vision-based navigation for off-road mobile robots will be described (with videos). The system autonomously learns to discriminate obstacles from traversable areas at long range.
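A minimal sketch of one such encoder/decoder layer helps make the idea concrete. This toy version uses tied weights and trains by plain gradient descent on the reconstruction error; the methods in the talk additionally impose sparsity constraints on the features, which this sketch omits:

```python
import numpy as np

# One encoder/decoder layer trained to reconstruct its input: the building
# block of the stacked architectures described above. Tied weights (the
# decoder is the transpose of the encoder) keep the sketch short.
def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_layer(x, n_hidden, lr=0.1, epochs=200, seed=0):
    rng = np.random.RandomState(seed)
    w = 0.1 * rng.randn(x.shape[1], n_hidden)   # tied encoder/decoder weights
    for _ in range(epochs):
        h = sigmoid(x @ w)                      # encoder: feature vector
        r = h @ w.T                             # decoder: reconstruction
        err = r - x                             # reconstruction error
        gh = (err @ w) * h * (1 - h)            # backprop through the encoder
        # gradient of the squared error w.r.t. both uses of the tied weights
        w -= lr / len(x) * (x.T @ gh + err.T @ h)
    return w

rng = np.random.RandomState(1)
x = rng.rand(100, 8)
w0 = 0.1 * np.random.RandomState(0).randn(8, 4)  # same init as train_layer
before = np.mean((sigmoid(x @ w0) @ w0.T - x) ** 2)
w = train_layer(x, n_hidden=4)
after = np.mean((sigmoid(x @ w) @ w.T - x) ** 2)
```

Stacking works by feeding each trained layer's features `sigmoid(x @ w)` as the input of the next layer, trained the same way, which is the sequential layer-wise scheme the talk describes.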

DeepLearningWorkshopNIPS2007 < Public < TWiki

Theoretical results strongly suggest that in order to learn the kind of complicated functions that can represent high-level abstractions (e.g. in vision, language, and other AI-level tasks), one may need "deep architectures", which are composed of multiple levels of non-linear operations (such as in neural nets with many hidden layers). Searching the parameter space of deep architectures is a difficult optimization task, but learning algorithms (e.g. Deep Belief Networks) have recently been proposed to tackle this problem with notable success, beating the state-of-the-art in certain areas. This workshop is intended to bring together researchers interested in the question of deep learning in order to review the current algorithms' principles and successes, but also to identify the challenges, and to formulate promising directions of investigation. Besides the algorithms themselves, there are many fundamental questions that need to be addressed: What would be a good formalization of deep learning? What new ideas could be exploited to make further inroads into that difficult optimization problem? What makes a good high-level representation or abstraction? What type of problem is deep learning appropriate for? The workshop presentation page shows selected links to relevant papers (PDF) on the topic.

YouTube - The Next Generation of Neural Networks

In the 1980s, new learning algorithms for neural networks promised to solve difficult classification tasks, like speech or object recognition, by learning many layers of non-linear features. The results were disappointing for two reasons: there was never enough labeled data to learn millions of complicated features, and the learning was much too slow in deep neural networks with many layers of features. These problems can now be overcome by learning one layer of features at a time and by changing the goal of learning. Instead of trying to predict the labels, the learning algorithm tries to create a generative model that produces data which looks just like the unlabeled training data. These new neural networks outperform other machine learning methods when labeled data is scarce but unlabeled data is plentiful. An application to very fast document retrieval will be described.

Neurophilosophy : An overview of corticogenesis

The winners of the first Kavli Prize were announced a couple of weeks ago. One of the three recipients of the prize for neuroscience was Pasko Rakic, a professor of neurobiology and neurology at the Yale School of Medicine. Rakic has spent most of his career investigating the development of the cerebral cortex of man and other mammals, and it is for his outstanding contribution to this area of research that he has been awarded the Kavli Prize for Neuroscience.

TimeSeries - Scikits - SciPy extension for time series in Python

The TimeSeries scikits module provides classes and functions for manipulating, reporting, and plotting time series of various frequencies.

Linux development on the PlayStation 3, Part 1: More than a toy

The Sony PlayStation 3 (PS3) runs Linux®, but getting it to run well requires some tweaking. In this article, first in a series, Peter Seebach introduces the features and benefits of PS3 Linux, and explains some of the issues that might benefit from a bit of tweaking.

PlotKit - Javascript Chart Plotting | liquidx

PlotKit is a chart and graph plotting library for JavaScript. It supports HTML Canvas, as well as SVG via Adobe SVG Viewer or native browser support.

What Dictionaries and Optical Illusions Say About Our Brains: Scientific American

Although many neuroscientists are trying to figure out how the brain works, Mark Changizi is bent on determining why it works that way. In the past, the assistant professor of cognitive science at Rensselaer Polytechnic Institute has demonstrated that the shapes of letters in 100 writing systems reflect common ones seen in nature...

John Resig - Processing.js

The Processing visualization language ported to JavaScript, using the Canvas element.

How to broadcast a live video stream

Technical notes on how the live video broadcast of the 2008 edition of PyCon FR was set up.

May 2008

Mind Hacks: Do Bayesian statistics rule the brain?

This week's New Scientist has a fascinating article on a possible 'grand theory' of the brain that suggests that virtually all brain functions can be modelled with Bayesian statistics - an approach discovered by an 18th century vicar.

Bayesian theory in New Scientist « Reverendbayes’s Weblog

The quest to understand the most complex object in the known universe has been a long and fruitful one. These days we know a good deal about how the human brain works - how our senses translate into electrical signals, how different parts of the brain process these signals, how memories form and how muscles are controlled. We know which brain regions are active when we listen to speech, look at paintings or barter over money. We are even starting to understand the deeper neural processes behind learning and decision-making.

CVXMOD – Convex optimization software in Python

CVXMOD is a Python-based tool for expressing and solving convex optimization problems. It uses CVXOPT as its solver. It is developed by Jacob Mattingley, as PhD work under Stephen Boyd at Stanford University. CVXMOD is primarily a modeling layer for CVXOPT. While it is possible to use CVXOPT directly, CVXMOD makes it faster and easier to build and solve problems. Advanced users who want to see or manipulate how their problems are being solved should consider using CVXOPT directly. Additional features are being added to CVXMOD beyond just modeling. These are currently experimental. CVXMOD has a similar design philosophy to CVX, a convex optimization modeling language for Matlab®, and uses the principles of disciplined convex programming, as developed by Michael Grant, Stephen Boyd and Yinyu Ye.

March 2008

Don Quixote Time Series Software

Don Quixote is new business software that uses artificial intelligence and powerful statistical methodology to achieve high forecasting accuracy. Whether you forecast market shares, sales, profits, or demand for services or material, Don Quixote will make your work faster, easier, and more accurate, and will improve your understanding of the nature of time series.

Linear Programming: Foundations and Extensions

- Balanced treatment of the simplex method and interior-point methods.
- Efficient source code (in C) for all the algorithms presented in the text.
- Thorough discussion of several interior-point methods, including primal-dual path-following, affine-scaling, and homogeneous self-dual methods.
- Extensive coverage of applications, including traditional topics such as network flows and game theory as well as less familiar ones such as structural optimization, L^1 regression, and the Markowitz portfolio optimization model.
- Over 200 class-tested exercises.
- A dynamically expanding collection of exercises.
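For a taste of the kind of problem the book treats with both the simplex and interior-point methods, here is a toy linear program solved with SciPy (the book's own code is in C; this is just an illustrative stand-in): maximize 3x + 2y subject to x + y ≤ 4, x + 3y ≤ 6, x, y ≥ 0.

```python
from scipy.optimize import linprog

# linprog minimizes, so the objective 3x + 2y is negated.
res = linprog(c=[-3, -2],
              A_ub=[[1, 1], [1, 3]],   # x + y <= 4 ; x + 3y <= 6
              b_ub=[4, 6],
              bounds=[(0, None), (0, None)])
# Checking the vertices of the feasible polygon by hand:
# (0,0) -> 0, (4,0) -> 12, (3,1) -> 11, (0,2) -> 4,
# so the optimum is at the vertex (4, 0) with objective value 12.
```

That the optimum lies at a vertex of the feasible polygon is exactly the geometric fact the simplex method exploits, while interior-point methods approach the same solution through the interior of the feasible region.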