Boris Eetgerink

September 26, 2022

I love the composition of the C# Open XML SDK

Recently I had to generate an export with some basic data. Unfortunately, CSV was not an option; a proper Excel file was required. There are quite a few options for generating Excel files, but most are wrappers around the Open XML SDK, and I don't really like that: you invest in a specific library and eventually it lacks that one feature you need. So I decided to use the SDK directly. It is difficult to understand at first: documentation is sparse, and the Productivity Tool is unsupported and generates a lot of unnecessary code. But after diving into it, I started to see some patterns:

  • Each sheet has a definition class and a data class which are linked with an Id string.
  • On the one hand Excel is very picky: elements have to be inserted in a specific order.
  • On the other hand Excel is also quite lenient: not all elements generated by the Productivity Tool are required to create a valid document.
  • Almost all classes have constructors that accept any number of child elements, allowing you to compose a document without assigning variables all the time.

Constructor overloads

The constructor overloads pattern in particular makes working with the SDK very powerful, so I'd like to dive a little deeper into that. Most classes in the SDK can have child elements, but some cannot. The classes that can have child elements all have four constructors. Take the Row class, for example:

public class Row : TypedOpenXmlCompositeElement
{
    public Row() : base() {}
    public Row(IEnumerable<OpenXmlElement> childElements) : base(childElements) {}
    public Row(params OpenXmlElement[] childElements) : base(childElements) {}
    public Row(string outerXml) : base(outerXml) {}
}

And this is the Workbook class and its constructors:

public class Workbook : TypedOpenXmlPartRootElement
{
    public Workbook() : base() {}
    public Workbook(IEnumerable<OpenXmlElement> childElements) : base(childElements) {}
    public Workbook(params OpenXmlElement[] childElements) : base(childElements) {}
    public Workbook(string outerXml) : base(outerXml) {}
}

What makes the constructors of these classes so powerful is the combination of the IEnumerable overload and the params overload. The params overload works best if you know the number of child elements in advance, and the IEnumerable overload works best if you don't. You can also freely combine the two. Here I know Worksheet has two child elements, but those each have a variable number of child elements of their own:

workbookPart.AddNewPart<WorksheetPart>($"Sheet{i}").Worksheet = new Worksheet(
    new Columns(GetColumns()),
    new SheetData(GetHeaderAndDataRowsForSheet(sheet.Value)));

If the params overload weren't present, the code would look like this. Not terrible, but certainly harder to read:

workbookPart.AddNewPart<WorksheetPart>($"Sheet{i}").Worksheet = new Worksheet(new OpenXmlElement[] {
    new Columns(GetColumns()),
    new SheetData(GetHeaderAndDataRowsForSheet(sheet.Value))
});
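The same mix of a fixed number of children at one level and a variable number at the next can be tried without the SDK. The following is a minimal sketch, assuming a hypothetical Node class that stands in for an SDK element; Node, NodeExample and BuildSheet are names made up for this illustration and are not part of the Open XML SDK:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an SDK element; not part of the Open XML SDK.
public class Node
{
    public string Name { get; }
    public IReadOnlyList<Node> Children { get; }

    // Use when the children come from a sequence of unknown length.
    public Node(string name, IEnumerable<Node> children)
    {
        Name = name;
        Children = children.ToList();
    }

    // Use when the children are listed directly at the call site.
    public Node(string name, params Node[] children)
        : this(name, children.AsEnumerable()) { }
}

public static class NodeExample
{
    // Two known children at the top level (params overload),
    // a variable number of rows below (IEnumerable overload).
    public static Node BuildSheet(IEnumerable<string> rowNames) =>
        new Node("Worksheet",
            new Node("Columns"),
            new Node("SheetData", rowNames.Select(n => new Node(n))));
}
```

Calling NodeExample.BuildSheet(new[] { "r1", "r2", "r3" }) gives a root with two children, the second of which holds three rows, without a single intermediate variable.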

Inheritance chain and base constructor

It is worth mentioning that the constructor overloads accept child elements of the type OpenXmlElement. That's the abstract base class that every element class in the SDK derives from. The inheritance chain for the Row class is:

Row : TypedOpenXmlCompositeElement : OpenXmlCompositeElement : OpenXmlElement

And for the Workbook class it is:

Workbook : TypedOpenXmlPartRootElement : OpenXmlPartRootElement : OpenXmlCompositeElement : OpenXmlElement

OpenXmlCompositeElement is the abstract class where the constructor is implemented:

protected OpenXmlCompositeElement(IEnumerable<OpenXmlElement> childrenElements)
    : this() { /* ... */ }

It only has the IEnumerable<OpenXmlElement> parameter overload, not the overload that accepts an array. That is because an array also implements IEnumerable, so the params overload is only a convenience overload for the users of the SDK!
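The forwarding described here can be reproduced in a few lines. This is a sketch under assumptions, not the SDK's actual source: CompositeBase, Leaf and Group are hypothetical names, and the base class deliberately exposes only an IEnumerable constructor so that both derived overloads have to go through it:

```csharp
using System.Collections.Generic;

// Hypothetical base class, loosely modelled on the pattern described above.
public abstract class CompositeBase
{
    private readonly List<CompositeBase> _children = new List<CompositeBase>();
    public IReadOnlyList<CompositeBase> Children => _children;

    protected CompositeBase() { }

    // The only place where children are actually appended.
    protected CompositeBase(IEnumerable<CompositeBase> children) : this()
    {
        foreach (var child in children) _children.Add(child);
    }
}

public class Leaf : CompositeBase { }

public class Group : CompositeBase
{
    public Group(IEnumerable<CompositeBase> children) : base(children) { }

    // A CompositeBase[] converts implicitly to IEnumerable<CompositeBase>,
    // so this binds to the same base constructor as the overload above.
    public Group(params CompositeBase[] children) : base(children) { }
}
```

Both new Group(new Leaf(), new Leaf()) and new Group(someList) end up in the single base constructor, so the logic for appending children exists only once.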

I want to use this in my own code

I want to apply this pattern in my own code when it makes sense to do so, especially when I know a constructor will usually be called with a fixed number of arguments. I also want to support the IEnumerable overload without duplicating the constructor implementation. Here's how to do it:

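using System.Collections.Generic;
using System.Linq; // AsEnumerable()

public class Foo
{
    // Convenience overload: forwards to the IEnumerable<Bar> constructor.
    public Foo(params Bar[] bars) : this(bars.AsEnumerable()) {}
    public Foo(IEnumerable<Bar> bars) { /* ... */ }
}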

Notice that I have to pass the array through the AsEnumerable() extension method when calling the IEnumerable constructor overload. Otherwise the params constructor would try to call itself, which is not allowed. Using AsEnumerable() is no problem: it doesn't allocate any memory, it just returns the same array typed as IEnumerable<Bar>.
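The claim that AsEnumerable() does not copy anything is easy to check. In the sketch below, the empty Bar class and the Bars property are additions made purely for illustration; the property only exists so that the stored reference can be compared with the original array:

```csharp
using System.Collections.Generic;
using System.Linq;

// Placeholder element type, only here to make the example compile.
public class Bar { }

public class Foo
{
    // Exposed only so the example can inspect what was passed in.
    public IEnumerable<Bar> Bars { get; }

    // AsEnumerable() changes the compile-time type, so overload
    // resolution picks the IEnumerable<Bar> constructor below.
    public Foo(params Bar[] bars) : this(bars.AsEnumerable()) { }

    public Foo(IEnumerable<Bar> bars) { Bars = bars; }
}
```

After var bars = new[] { new Bar(), new Bar() }; and var foo = new Foo(bars);, the expression ReferenceEquals(foo.Bars, bars) is true: the constructor received the very same array. A cast, this((IEnumerable<Bar>)bars), should have the same effect; which of the two reads better is a matter of taste.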

Conclusion

If I have a constructor in my own code that accepts multiple elements as an IEnumerable, I will certainly add an overload with the params keyword if it makes sense. It makes the code more composable and easier to read.