PUBLIC marks

PUBLIC MARKS with tag unicode

This month

Python, Unicode and UnicodeDecodeError

by marco
Unicode ---- encode ----> ASCII
ASCII ---- decode ----> Unicode
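A minimal sketch of the two directions above, written for Python 3. That is an assumption: the bookmarked article dates from 2009 and may target Python 2, where the types are named differently. The sample strings are illustrative only.

```python
# Unicode text -> bytes: encode
text = "caf\u00e9"                      # 'café', one non-ASCII character
utf8_bytes = text.encode("utf-8")

# bytes -> Unicode text: decode
round_trip = utf8_bytes.decode("utf-8")

# Decoding non-ASCII bytes with the ASCII codec raises UnicodeDecodeError
try:
    utf8_bytes.decode("ascii")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# Pure ASCII text goes through the ASCII codec in both directions without error
ascii_bytes = "cafe".encode("ascii")
ascii_text = ascii_bytes.decode("ascii")
```

The error typically appears when bytes in one encoding are decoded with a codec that cannot represent them, so naming the encoding explicitly at each boundary is usually the safer habit.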

October 2009

September 2009

August 2009

Character encoding detection for external scripts

by karlcow

This is (EF BB BF) C3 B6 3D 22 21 22 loaded into browsers under various labels. That happens to be properly formed ECMAScript code for all the encodings used. The bogus results for Opera9 can easily be reproduced in the context of the testing script, but probably not individually from a clean cache; what's going on there is unknown. I also noted in running these tests that Opera claims "Opera supports the entire ECMA-262 2nd and 3rd standards with no exceptions" while in fact their implementation does not: the parser rejects code that follows the IdentifierStart :: UnicodeEscapeSequence production of ECMA-262 section 7.6. Instead it implements Opera-only extensions, like comma-free arrays à la [ 1 2 3 ]. Other fun facts include: IE does not implement onload for iframes and cannot modify the innerHTML of tr elements; Firefox ignores "tags" when setting the innerHTML of dynamically created tr elements with no ownerElement... Oh, and Opera again needs </th> "tags" so it won't nest adjacent th elements when setting innerHTML.
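To see why the label matters, the quoted byte sequence can be decoded under different encodings. This is only a sketch in Python: the two labels used here (UTF-8 and ISO-8859-1) are assumptions, since the excerpt does not list which encodings the test covered.

```python
import codecs

payload = bytes.fromhex("C3 B6 3D 22 21 22")   # the six bytes quoted above
with_bom = codecs.BOM_UTF8 + payload            # same bytes with the optional EF BB BF prefix

as_utf8 = payload.decode("utf-8")               # C3 B6 becomes one character
as_latin1 = payload.decode("iso-8859-1")        # C3 and B6 become two characters
as_utf8_sig = with_bom.decode("utf-8-sig")      # "utf-8-sig" strips the BOM

print(repr(as_utf8))
print(repr(as_latin1))
print(as_utf8_sig == as_utf8)
```

The same bytes yield a different number of characters before the `=` sign depending on the label, which is the ambiguity the test page probes.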

Flickr - Global backup/export of data (metadata) - Wiki URFIST

by decembre
Flickr has the major drawback of not allowing a global backup/export of data. Exporting the images is easy, but the main problem remains: how to back up the metadata added to the photographs. Keep the metadata in the images themselves and not in an external file. That is the point of the golden rule laid down by Erwyn Van Der Meer, author of the Flickr Metadata Synchr tool listed above, in an article where he gives a very clear and concise overview of the formats and of how Flickr uses them (or not): "Store the metadata in your images!" Which metadata are we talking about?
* those recorded by the (digital) camera at the moment the photograph is taken (technical information); reference format: EXIF
* those added manually (in particular keywords / tags); formats: IPTC (more) and XMP (more), the latter being more recent and handling Unicode.

BabelStone : Software : BabelPad (Unicode Text Editor for Windows)

by parmentierf (via)
BabelPad is a free Unicode text editor for Windows that supports the proper rendering of most complex scripts, and allows you to assign different fonts to different scripts in order to facilitate multi-script text editing.

July 2009

June 2009

May 2009

March 2009

Understanding Bidirectional (BIDI) Text in Unicode

by Spone
A little-understood corner of Unicode is its handling of bidirectional text (the spec is a little dry). While English and many other languages are read left-to-right, plenty of scripts (notably Arabic and Hebrew) are read right-to-left. When only a single direction of text is used in a document, it's fairly straightforward, but when texts with different directions are mixed in one document, some difficulty arises in determining direction. This document attempts to explain how bidirectional text in Unicode works and what this means for the web.
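As a small illustration, every Unicode character carries a bidirectional class, which is the input for determining direction in mixed text. The sketch below uses Python's standard `unicodedata` module; the sample characters are chosen here for illustration and do not come from the bookmarked article.

```python
import unicodedata

# Illustrative sample characters (not taken from the article)
samples = {
    "Latin a": "a",
    "Hebrew alef": "\u05d0",
    "Arabic alef": "\u0627",
    "digit 1": "1",
}

# Look up the bidirectional class of each character
bidi_classes = {name: unicodedata.bidirectional(ch) for name, ch in samples.items()}

for name, cls in bidi_classes.items():
    print(f"{name}: {cls}")
# Latin a: L       (left-to-right)
# Hebrew alef: R   (right-to-left)
# Arabic alef: AL  (right-to-left Arabic)
# digit 1: EN      (European number)
```

Strongly typed characters (L, R, AL) set the direction of a run, while weak classes such as EN take their placement from the surrounding context.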

January 2009

URDU Unicode Utility - About

by parmentierf
The Unicode Relational Database Utility (URDU) is a database composed of character set data of interest to libraries.

December 2008

The Voidspace Techie Blog

by karlcow

If we are only ever using this file on the current system then maybe we don't need to worry, but if we ever need to read data that might have been created on another system then we had better know what encoding was used. The solution: the open function takes an optional encoding parameter:
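The code that followed in the original post is not part of this excerpt. As a stand-in, here is a minimal sketch assuming the Python 3 built-in `open`, whose `encoding` parameter names the codec used for text files; the file path and sample text are hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this illustration
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Write text with an explicit encoding rather than the platform default
with open(path, "w", encoding="utf-8") as f:
    f.write("na\u00efve")

# Read it back, naming the same encoding
with open(path, "r", encoding="utf-8") as f:
    content = f.read()

# The bytes on disk are the UTF-8 form of the text
with open(path, "rb") as f:
    raw = f.read()
```

Without the `encoding` argument, the codec depends on the system's locale settings, which is the portability concern the excerpt raises.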

Blog Stéphane Bortzmeyer: My talk at Sparkling Point on the political consequences of technical choices

by CharlesNepote (via)

A talk on "The political consequences of technical choices", following Lawrence Lessig's observation that "The code is the law" (the architecture of a technical system is the law).

I develop three examples drawn from the Internet world: IP address allocation policies, the revision of the standard for domain names in Unicode, and the debates over the Internet's new routing and addressing architecture.

November 2008

Unicode HOWTO

by parmentierf & 1 other
This HOWTO discusses Python's support for Unicode, and explains various problems that people commonly encounter when trying to work with Unicode.

October 2008
