I've been working on improving the RSS feeds generated by Lighthouse. One persistent problem is that some RSS readers (IE, for instance) choke on many special characters, such as those pasted from Microsoft Word. I had previously added code to replace many of those characters, but today I came across another one that slipped through. I knew I needed a better solution.
On CFlib.org, I found a function called xmlFormat2 that smartly avoids maintaining a list of characters to replace, and just replaces all characters not in a list of "good" characters. That makes sense. And it works.
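To make the idea concrete, here is a rough sketch of the "replace everything that isn't known to be good" approach. It is not the xmlFormat2 code from CFlib.org, just an illustration that assumes the "good" list is plain ASCII (code points 0 to 127); the variable names are made up for the example.

```cfm
<!--- Hypothetical sample input: curly quotes like those pasted from Word. --->
<cfset input = "Smart " & Chr(8220) & "quotes" & Chr(8221)>
<cfset output = "">
<!--- Walk the string one character at a time. --->
<cfloop index="i" from="1" to="#Len(input)#">
    <cfset c = Mid(input, i, 1)>
    <cfif Asc(c) lte 127>
        <!--- Assumed "good" character: keep as-is. --->
        <cfset output = output & c>
    <cfelse>
        <!--- Anything else becomes a numeric character reference. --->
        <cfset output = output & "&##" & Asc(c) & ";">
    </cfif>
</cfloop>
```

Checking and concatenating every single character like this is the kind of work that could get slow on long feed entries.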
I was concerned, though, about the performance of the function, and I thought it could be done better. Using the REMatch function (introduced in ColdFusion 8), I was able to make the function both simpler and much faster. My testing has been limited, but so far it has handled everything I have thrown at it. And here it is:
<cffunction name="XmlSafeText" hint="Replaces all characters that would break an XML file." returnType="string" output="false">
    <cfargument name="txt" hint="String to format" type="string" required="true">
    <cfset var chars = "">
    <cfset var replaced = "">
    <!--- Var-scope the loop variable so it stays local to the function. --->
    <cfset var char = "">
    <!--- Use the built-in XmlFormat function first. --->
    <cfset txt = XmlFormat(txt)>
    <!--- Get all remaining non-ASCII characters to replace. --->
    <cfset chars = REMatch("[^[:ascii:]]",txt)>
    <!--- Loop through the characters and replace each with a numeric character reference. Maintain a list of characters already replaced to avoid duplicate work. --->
    <cfloop index="char" array="#chars#">
        <cfif ListFind(replaced,char) is 0>
            <cfset txt = Replace(txt,char,"&##" & asc(char) & ";","all")>
            <cfset replaced = ListAppend(replaced,char)>
        </cfif>
    </cfloop>
    <cfreturn txt>
</cffunction>
It should be possible to use it as a replacement for the built-in XmlFormat function. Let me know if you run into any problems with it.
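As a quick way to try it out, something like the following should work. This is only a sketch: it assumes XmlSafeText is defined on the same page (or otherwise in scope), and the sample string and variable names are invented for the example.

```cfm
<!--- Hypothetical sample: curly quotes (code points 8220 and 8221) plus an ampersand. --->
<cfset sample = "She said " & Chr(8220) & "hello" & Chr(8221) & " & left">
<cfset safe = XmlSafeText(sample)>
<!--- HTMLEditFormat escapes the result so it can be read literally in a browser. --->
<cfoutput>#HTMLEditFormat(safe)#</cfoutput>
```

The ampersand should be handled by the XmlFormat call inside the function, and the curly quotes should come back as numeric character references built from their asc() values.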
Posted on July 6, 2009 7:08:07 PM EDT by David Hammond