PUBLIC MARKS from imelgrat with tags "PHP Classes" & feed

17 July 2007 16:15

Using Regular Expressions to Find RSS Links on a Page

The class performs three main steps. First, it uses the cURL library to fetch the content at the URL the user passed. Second, since PHP doesn’t have a built-in SGML parser like Python seems to, getting all the "link" tags has to be done manually; a few regular expressions and some simple string splitting made it really easy. Last but not least, the function goes through all the links found, figures out which ones belong to RSS feeds, resolves them to absolute URLs if necessary, and stores them in an array, making sure each link isn’t already listed to prevent duplicates (e.g., when the same RSS feed appears more than once in the page).
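
The three steps described above can be sketched roughly as follows. This is a minimal illustration, not the class's actual code: the function names fetchPage, resolveUrl and extractFeedLinks are hypothetical, the RSS check assumes feeds are advertised with type="application/rss+xml", and the URL resolution is deliberately simplified (it does not collapse "../" segments).

```php
<?php
// Hypothetical helper names; the original class's API is not shown in the source.

/** Step 1: fetch the page body with cURL. Returns null on failure. */
function fetchPage(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow HTTP redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // give up after 10 seconds
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

/** Simplified relative-to-absolute resolution (assumption: no "../" handling). */
function resolveUrl(string $href, string $baseUrl): string
{
    // Already absolute: starts with a scheme such as http://
    if (preg_match('#^[a-z][a-z0-9+.\-]*://#i', $href)) {
        return $href;
    }
    $base   = parse_url($baseUrl);
    $scheme = $base['scheme'] ?? 'http';
    $origin = $scheme . '://' . ($base['host'] ?? '')
            . (isset($base['port']) ? ':' . $base['port'] : '');

    if (strpos($href, '//') === 0) {   // protocol-relative: //host/path
        return $scheme . ':' . $href;
    }
    if (strpos($href, '/') === 0) {    // root-relative: /path
        return $origin . $href;
    }
    // Path-relative: resolve against the directory of the base path
    $path = $base['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $href;
}

/** Steps 2 and 3: pull out <link> tags, keep RSS ones, resolve and de-duplicate. */
function extractFeedLinks(string $html, string $baseUrl): array
{
    $feeds = [];
    // Step 2: grab every <link ...> tag with a regular expression
    preg_match_all('/<link\b[^>]*>/i', $html, $tags);

    foreach ($tags[0] as $tag) {
        // Split the tag into quoted name="value" (or name='value') attributes
        preg_match_all('/([a-z\-]+)\s*=\s*(["\'])(.*?)\2/is', $tag, $matches, PREG_SET_ORDER);
        $attrs = [];
        foreach ($matches as $m) {
            $attrs[strtolower($m[1])] = html_entity_decode($m[3]);
        }

        // Step 3: keep only links that declare an RSS MIME type and have an href
        $type = strtolower($attrs['type'] ?? '');
        if (!isset($attrs['href']) || $type !== 'application/rss+xml') {
            continue;
        }

        $url = resolveUrl($attrs['href'], $baseUrl);
        if (!in_array($url, $feeds, true)) { // skip feeds that are already listed
            $feeds[] = $url;
        }
    }
    return $feeds;
}
```

In use, the page URL would be passed to fetchPage and, if a body comes back, the body and the same URL would be handed to extractFeedLinks to get the list of feed URLs. Keeping the fetch separate from the parsing makes the regular-expression part easy to try on a saved HTML string without touching the network.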