Programmatic generation of Word documents

I had the opportunity to play around with generating Word documents last week. At first I looked at what I had used before (the Office PIA).

I quickly remembered that it is not very easy to work with, because most of it is a thin layer over COM interop. Furthermore, I stumbled across an MSDN article stating that the Office PIA should not be used to generate documents on a server; it suggested using it only in desktop applications where a user is controlling the application. I imagine that is because the Office PIA really only starts the Word application and tells it what to do in real time (or something like that)… and that makes sense to me. Can you imagine a web server with even 100 users, each spawning their own WINWORD.EXE process just to generate a doc?! Production… Nightmare…

Looking for alternatives, I stumbled across Microsoft’s Open XML API. What a wonderful thing. Sure, it doesn’t generate documents in the well-established .doc format from Word 97, but then again, that format dates from 1997. Isn’t it about time we moved on? Office 2007 and later even default to the .docx format. I used to reset the default back to .doc, but eventually gave up and started paying attention to who the audience of my documents was before I hit the save button; something a good writer should do anyway.

In any event, after deciding that it was OK to generate .docx and then convert to .doc (only if needed), I played around with the API a bit and was astounded by how much easier it is to create a Word document with the Open XML API than with the Office PIA. Take a look at these little examples:

Creating a document package:

using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

using (MemoryStream docStream = new MemoryStream())
{
    using (WordprocessingDocument document = WordprocessingDocument.Create(
        docStream,
        WordprocessingDocumentType.Document))
    {
        document.AddMainDocumentPart();

        // A method I use to set up the (sort of CSS-like) styles for the document
        SetupStyles(document.MainDocumentPart.AddNewPart<StyleDefinitionsPart>());

        document.MainDocumentPart.Document =
            new Document(
                new Body());

        // DO STUFF TO THE DOC HERE
    } // Disposing the document saves and closes the package

    // Read the memory stream to get the binary contents of the document
    byte[] contents = docStream.ToArray();
}
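
The SetupStyles method itself is not shown here. As a rough idea of what such a helper could contain, the sketch below defines a single paragraph style with the ID "Heading1" so that the header example has something to refer to. The style's name, bold setting, and font size are assumptions chosen purely for illustration, not the styles I actually used.

```csharp
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class StyleSetup
{
    // Hypothetical stand-in for SetupStyles: registers one paragraph style.
    public static void SetupStyles(StyleDefinitionsPart stylesPart)
    {
        Style heading1 = new Style(
            new StyleName() { Val = "heading 1" },
            new StyleRunProperties(
                new Bold(),
                new FontSize() { Val = "32" })) // size is in half-points, so 32 = 16pt
        {
            Type = StyleValues.Paragraph,
            StyleId = "Heading1" // the ID that ParagraphStyleId.Val refers to
        };

        // Assigning the root element attaches the style sheet to the part.
        stylesPart.Styles = new Styles(heading1);
    }
}
```

A paragraph whose ParagraphStyleId is "Heading1" would then pick up this formatting when the document is opened in Word.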

Add a header line:

// Add a Header1
Paragraph p1 = new Paragraph(
    new ParagraphProperties(
        new ParagraphStyleId() { Val = "Heading1" }),
    new Run(
        new Text("This is a test header")));
document.MainDocumentPart.Document.Body.AppendChild(p1);

Create a table:

// Add a table
Table t = new Table();

TableRow r1 = new TableRow(
    new TableCell(
        new Paragraph(
            new Run(
                new Text("Row1-Cell1")))),
    new TableCell(
        new Paragraph(
            new Run(
                new Text("Row1-Cell2")))));
TableRow r2 = new TableRow(
    new TableCell(
        new Paragraph(
            new Run(
                new Text("Row2-Cell1")))),
    new TableCell(
        new Paragraph(
            new Run(
                new Text("Row2-Cell2")))));

t.AppendChild(r1);
t.AppendChild(r2);

document.MainDocumentPart.Document.Body.AppendChild(t);

Those are only three of the snippets I was able to come up with in a couple of hours. I have other examples for adding images, internal hyperlinks, and external hyperlinks, all of which are about equally easy.
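
To give a feel for the hyperlink case, here is a minimal sketch of how external and internal links can be built with the same API. It is not the exact code from my project; the class name HyperlinkSketch and its method names are made up for illustration. An external link stores its URL as a relationship on the main document part, while an internal link points at a bookmark by name.

```csharp
using System;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class HyperlinkSketch
{
    // External link: the URL lives in a relationship on the main part,
    // and the Hyperlink element refers to it by relationship ID.
    public static Paragraph ExternalLink(MainDocumentPart mainPart, string text, Uri target)
    {
        HyperlinkRelationship rel = mainPart.AddHyperlinkRelationship(target, true);
        return new Paragraph(
            new Hyperlink(
                new Run(
                    new Text(text)))
            { Id = rel.Id });
    }

    // Internal link: the Hyperlink element names a bookmark via Anchor.
    public static Paragraph InternalLink(string text, string bookmarkName)
    {
        return new Paragraph(
            new Hyperlink(
                new Run(
                    new Text(text)))
            { Anchor = bookmarkName });
    }

    // A paragraph wrapped in a bookmark that InternalLink can target.
    public static Paragraph BookmarkedParagraph(string text, string bookmarkName, string bookmarkId)
    {
        return new Paragraph(
            new BookmarkStart() { Name = bookmarkName, Id = bookmarkId },
            new Run(
                new Text(text)),
            new BookmarkEnd() { Id = bookmarkId });
    }
}
```

Each method returns a Paragraph that can be appended to document.MainDocumentPart.Document.Body just like the header and table above. The link text will only look like a link (blue, underlined) if a matching character style is applied to the run, which this sketch leaves out.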

The only downside I can see so far to the MS Open XML API is that it doesn’t seem very well documented. I found better examples contributed by Joe Nobodies like myself than I did from Microsoft. Sure, they have class documentation, but that is no substitute for examples when dealing with a complex API like this.

I hope I get to work with the API more soon, because it seems fun!
