Some background first...
I needed to diff HTML content in a project I was working on, which brought me pretty quickly to DaisyDiff, a really nice Java-based utility. DaisyDiff doesn't, however, have a simple built-in function to diff two strings. There is a command-line option, which takes the paths of two files as arguments, and also a Java API that takes a number of Java objects as arguments. What I wanted was a function that took two strings and returned the marked-up result.
I don't really do Java development. I've done some in the past, but it's been a while and it would probably take me some time to get my development environment up to snuff. Besides, I didn't really feel like dealing with compiled code.
A quick Google search, of course, turns up CFX_CompareHTML and the JavaLoader version of the same thing. So I used that, and it worked fine. But it was using an old version of DaisyDiff, and it seemed to have some bugs with Unicode characters and such. What I really wanted to do was to use JavaLoader to load the current version of DaisyDiff. After much stumbling around in the code, I found that the test suite in the DaisyDiff repository has exactly the function I wanted: it compares two strings and returns the result.
So, long story short, I took the code from that function and pulled it into a CFC, using JavaLoader, and rewrote everything in CFML. The result is the simple function I was after.
So anyway, here it is:
<cfcomponent hint="Wrapper for DaisyDiff" output="false">

    <cffunction name="Init" output="false" returntype="DaisyDiff">
        <cfargument name="daisydiffpath" hint="absolute path to daisydiff jar file" type="string" required="true">
        <cfargument name="javaloaderpath" hint="component path to JavaLoader.cfc" type="string" required="true">
        <cfset This.daisydiffpath = arguments.daisydiffpath>
        <cfset This.javaloaderpath = arguments.javaloaderpath>
        <cfreturn This>
    </cffunction>

    <cffunction name="Diff" output="false" returntype="string">
        <cfargument name="olderHtml" type="string" required="true">
        <cfargument name="newerHtml" type="string" required="true">

        <!--- Load the DaisyDiff jar through JavaLoader --->
        <cfset var paths = [This.daisydiffpath]>
        <cfset var loader = createObject("component", This.javaloaderpath).init(paths)>

        <!--- Class handles; Init() is called on these below to construct instances --->
        <cfset var TransformerFactoryImpl = loader.create("com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl")>
        <cfset var StringReader = loader.create("java.io.StringReader")>
        <cfset var StringWriter = loader.create("java.io.StringWriter")>
        <cfset var Locale = loader.create("java.util.Locale")>
        <cfset var StreamResult = loader.create("javax.xml.transform.stream.StreamResult")>
        <cfset var OutputKeys = loader.create("javax.xml.transform.OutputKeys")>
        <cfset var NekoHtmlParser = loader.create("org.outerj.daisy.diff.helper.NekoHtmlParser")>
        <cfset var DomTreeBuilder = loader.create("org.outerj.daisy.diff.html.dom.DomTreeBuilder")>
        <cfset var HTMLDiffer = loader.create("org.outerj.daisy.diff.html.HTMLDiffer")>
        <cfset var HtmlSaxDiffOutput = loader.create("org.outerj.daisy.diff.html.HtmlSaxDiffOutput")>
        <cfset var TextNodeComparator = loader.create("org.outerj.daisy.diff.html.TextNodeComparator")>
        <cfset var InputSource = loader.create("org.xml.sax.InputSource")>

        <!--- The transformer handler writes its output into finalResult --->
        <cfset var finalResult = StringWriter.Init()>
        <cfset var result = TransformerFactoryImpl.Init().newTransformerHandler()>
        <cfset var sr = StreamResult.Init(finalResult)>
        <cfset var prefix = "diff">
        <cfset var cleaner = NekoHtmlParser.Init()>
        <cfset var oldSource = InputSource.Init(StringReader.Init(olderHtml))>
        <cfset var newSource = InputSource.Init(StringReader.Init(newerHtml))>
        <cfset var oldHandler = DomTreeBuilder.Init()>
        <cfset var newHandler = DomTreeBuilder.Init()>
        <cfset var leftComparator = "">
        <cfset var rightComparator = "">
        <cfset var output = "">
        <cfset var differ = "">
        <cfset var diff = "">

        <cfset result.setResult(sr)>
        <cfset result.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes")>

        <!--- Parse each HTML string into a DOM tree and wrap it in a comparator --->
        <cfset cleaner.parse(oldSource, oldHandler)>
        <cfset leftComparator = TextNodeComparator.Init(oldHandler, Locale.getDefault())>
        <cfset cleaner.parse(newSource, newHandler)>
        <cfset rightComparator = TextNodeComparator.Init(newHandler, Locale.getDefault())>

        <!--- Run the diff, then read the marked-up HTML back out of the StringWriter --->
        <cfset output = HtmlSaxDiffOutput.Init(result, prefix)>
        <cfset differ = HTMLDiffer.Init(output)>
        <cfset differ.diff(leftComparator, rightComparator)>
        <cfset diff = finalResult.toString()>
        <cfreturn diff>
    </cffunction>

</cfcomponent>
Usage:
<!--- "var" only works inside a function; drop it if you run these lines at the template level --->
<cfset var daisy = CreateObject("component","cfc.DaisyDiff").Init(expandPath("../daisydiff-1.1/daisydiff.jar"),"Lighthouse.Utilities.javaloader.JavaLoader")>
<cfset var diff = daisy.diff(olderhtml,newerhtml)>
The result is HTML that has been marked up by DaisyDiff with special classes. You can take that and style it in any way that you see fit.
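As a rough sketch of what that styling might look like, here is a small template that diffs two sample strings and colors the output. The class names (diff-html-added, diff-html-removed, diff-html-changed) are what I'd expect DaisyDiff to emit, but treat them as an assumption and check them against your own output. The sample strings and the variable names daisy and markedUp are made up for illustration, and the paths are the ones from the usage lines above.

```cfml
<!--- Sketch only: the class names in the style block are assumed, so verify them against real DaisyDiff output --->
<cfset daisy = createObject("component", "cfc.DaisyDiff").Init(
    expandPath("../daisydiff-1.1/daisydiff.jar"),
    "Lighthouse.Utilities.javaloader.JavaLoader")>

<!--- Hypothetical sample input --->
<cfset markedUp = daisy.Diff("<p>The quick brown fox</p>", "<p>The quick red fox</p>")>

<style type="text/css">
    /* text that only exists in the newer version */
    .diff-html-added { background-color: lightgreen; }
    /* text that only exists in the older version */
    .diff-html-removed { background-color: lightpink; text-decoration: line-through; }
    /* text whose formatting changed */
    .diff-html-changed { background-color: lightyellow; }
</style>

<cfoutput>#markedUp#</cfoutput>
```

The colors use CSS names rather than hex values so there are no stray pound signs to escape if the style block ever ends up inside a cfoutput.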
I'm sure there are some refinements that could be made to this CFC. The class name prefix, for instance, is hardcoded to "diff", and that could be turned into an argument if you need a different prefix. Someone more familiar with the Java classes used here could probably find problems too, and I would welcome the corrections.
Posted on March 29, 2010 4:37:47 PM EDT by David Hammond
What do you think? I'm happy to help with the github bit :)
Posted on July 9, 2010 8:42:02 AM EDT by David Boyer
Posted on July 13, 2010 9:48:16 AM EDT by David Hammond
thanks
Posted on October 5, 2012 9:16:12 AM EDT by manoj kumar