Integrate OpenOffice with Java without Installing OpenOffice
Until a few days ago, I always had to work with the rather cumbersome Office Bean and UNO Runtime when integrating OpenOffice into a Java application. I also had to configure a whole bunch of things to force OpenOffice to play nicely with the Java integration. Two days ago, however, I found out about the ODF Toolkit. It seems to be a relatively new project, independent since some time last year, though I could be wrong.
What's especially interesting is ODFDOM: "ODFDOM is an OpenDocument (ODF) framework. Its purpose is to provide an easy common way to create, access and manipulate ODF files, without requiring detailed knowledge of the ODF specification. It is designed to provide the ODF developer community an easy lightweight programming API, portable to any object-oriented language."
Here's a snippet of it in action:
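import java.io.File;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
// Plus OdfDocument and OdfFileDom from odfdom.jar (their package depends on the ODFDOM release).

public static void main(String[] args) {
    try {
        OdfDocument odfDoc = OdfDocument.loadDocument(new File("/home/geertjan/test.ods"));
        OdfFileDom odfContent = odfDoc.getContentDom();
        XPath xpath = odfDoc.getXPath();
        // Select the first cell of every table row. A NODESET result is an org.w3c.dom.NodeList,
        // so there is no need to cast to the XPath implementation's internal DTMNodeList class.
        NodeList nodeList = (NodeList) xpath.evaluate("//table:table-row/table:table-cell[1]", odfContent, XPathConstants.NODESET);
        for (int i = 0; i < nodeList.getLength(); i++) {
            Node cell = nodeList.item(i);
            // Skip rows whose first cell is empty.
            if (!cell.getTextContent().isEmpty()) {
                System.out.println(cell.getTextContent());
            }
        }
    } catch (Exception ex) {
        //Handle...
    }
}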
Let's assume that the first column of the 'test.ods' file above contains three names, one per row. The code listing would then print the following:
Cuthbert
Algernon
Wilbert
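To make the XPath expression above less of a black box, here is a minimal sketch that runs the same query with nothing but the JDK. It is not how ODFDOM does it internally; the class name FirstColumnDemo and the XML in SAMPLE are made up for illustration, and SAMPLE is a heavily simplified stand-in for a real spreadsheet's content. The one thing that has to be supplied by hand here is the mapping from the "table" prefix to the ODF table namespace, which in the listing above is presumably taken care of by the XPath object that ODFDOM hands out.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of ODFDOM.
public class FirstColumnDemo {

    static final String TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0";
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    // Hand-written, simplified sample: three rows, the second with an empty first cell.
    static final String SAMPLE =
        "<table:table xmlns:table='" + TABLE_NS + "' xmlns:text='" + TEXT_NS + "'>"
        + "<table:table-row>"
        + "<table:table-cell><text:p>Cuthbert</text:p></table:table-cell>"
        + "<table:table-cell><text:p>42</text:p></table:table-cell>"
        + "</table:table-row>"
        + "<table:table-row><table:table-cell/></table:table-row>"
        + "<table:table-row>"
        + "<table:table-cell><text:p>Algernon</text:p></table:table-cell>"
        + "</table:table-row>"
        + "</table:table>";

    // Returns the non-empty text of the first cell of every row.
    static List<String> firstColumn(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // required for prefixed XPath steps to match
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Tell the XPath engine which namespace the "table" prefix stands for.
        xpath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                return "table".equals(prefix) ? TABLE_NS : XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String uri) { return null; }
            public Iterator<String> getPrefixes(String uri) { return null; }
        });

        NodeList cells = (NodeList) xpath.evaluate(
                "//table:table-row/table:table-cell[1]", doc, XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < cells.getLength(); i++) {
            String text = cells.item(i).getTextContent();
            if (!text.isEmpty()) {
                values.add(text);
            }
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstColumn(SAMPLE)); // prints [Cuthbert, Algernon]
    }
}
```

Note that the "[1]" applies per row, so the "42" in the second column is never selected, and the row with the empty first cell is filtered out by the isEmpty() check, just as in the ODFDOM listing.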
And, as a second example, here's me reading the first paragraph of an OpenOffice Text document:
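import java.io.File;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
// Plus OdfDocument, OdfFileDom and OdfParagraphElement from odfdom.jar (their package depends on the ODFDOM release).

public static void main(String[] args) {
    try {
        OdfDocument odfDoc = OdfDocument.loadDocument(new File("/home/geertjan/chapter2.odt"));
        OdfFileDom odfContent = odfDoc.getContentDom();
        XPath xpath = odfDoc.getXPath();
        // With XPathConstants.NODE, only the first match in document order is returned.
        OdfParagraphElement para = (OdfParagraphElement) xpath.evaluate("//text:p[1]", odfContent, XPathConstants.NODE);
        // Prints the paragraph's first child node, i.e. its leading run of plain text.
        System.out.println(para.getFirstChild().getNodeValue());
    } catch (Exception ex) {
        //Handle...
    }
}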
On my classpath I have "odfdom.jar" and "xerces-2.8.0.jar". I don't need to have OpenOffice installed at all, which means I can very easily process a whole bunch of spreadsheets (or other OpenOffice output) (a) without installing OpenOffice and (b) faster than I otherwise could, since OpenOffice doesn't need to be started up, via the Office Bean or otherwise. In fact, Aljoscha Rittner from Sepix, who told me about this project and who is using it in his commercial applications, reports that his processing time has dropped to a fraction of what it was, partly because he no longer needs to handle the situation where OpenOffice crashes randomly in the middle of long-running processes, such as during the night, when nobody is around to restart it.
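Part of the reason no OpenOffice installation is needed is that a regular ODF document (leaving aside the "flat" single-file XML variants) is an ordinary ZIP package whose main content lives in an entry named content.xml. The sketch below, which is only an illustration and not something ODFDOM requires you to write, pulls that entry out with the JDK's own java.util.zip classes. The class name OdfPackagePeek and the method readContentXml are hypothetical names chosen for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper class, not part of ODFDOM.
public class OdfPackagePeek {

    // Returns the raw XML of the content.xml entry inside an ODF package.
    static String readContentXml(Path odfFile) throws IOException {
        try (ZipFile zip = new ZipFile(odfFile.toFile())) {
            ZipEntry entry = zip.getEntry("content.xml");
            if (entry == null) {
                throw new FileNotFoundException("content.xml not found in " + odfFile);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                // Copy the entry into memory in chunks.
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int read;
                while ((read = in.read(chunk)) != -1) {
                    buffer.write(chunk, 0, read);
                }
                return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the path of an .ods or .odt file as the first argument.
        System.out.println(readContentXml(Paths.get(args[0])));
    }
}
```

Since this is plain file and XML handling, there is no office process to start, monitor, or restart, which fits with the speed and stability gains described above. In practice ODFDOM saves you from working at this level, so treat this as a look under the hood rather than a recommended approach.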