Setting datatype for a cell - ClosedXML/OpenXML
Published on

Setting datatype for a cell - ClosedXML/OpenXML

Authors

In this post, I will explain how to set the datatype for cells both in ClosedXML/OpenXML. ClosedXML is an open-source C# library that is built on OpenXML. It kind of faces the same challenges as the base library.

If you want a quick answer.

Use the NumberFormat property of Cell/Range instead of the DataType property to change the format shown in excel.

If you’re curious about NumberFormat and DataType, why those two solve different purposes in Excel, please browse through this post, where I’ll be experimenting and explaining in more detail.

Table of Contents

Setting Cell Data type with OpenXML

Here is a standard OpenXML code, where we assign a string “CodingJump” to Cell A1 and also try to change the data type to String/Text.

Cell cell = new Cell()
{
	CellReference = "A1", // Location of Cell
	DataType = CellValues.String,
	CellValue = new CellValue("CodingJump") // Setting Cell Value
};

Below is the full code for the above example.

using System;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
namespace CellTypeEx
{
    class Program
    {
        static void Main(string[] args)
        {
            string fileName = @"CellValueEx.xlsx";
            // Create a spreadsheet document by supplying the fileName.
            SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.
                Create(fileName, SpreadsheetDocumentType.Workbook);
            // Add a WorkbookPart to the document.
            WorkbookPart workbookpart = spreadsheetDocument.AddWorkbookPart();
            workbookpart.Workbook = new Workbook();
            // Add a WorksheetPart to the WorkbookPart.
            WorksheetPart worksheetPart = workbookpart.AddNewPart<WorksheetPart>();
            worksheetPart.Worksheet = new Worksheet(new SheetData());
            // Add Sheets to the Workbook.
            Sheets sheets = spreadsheetDocument.WorkbookPart.Workbook.AppendChild<Sheets>(new Sheets());
            // Append a new worksheet and associate it with the workbook.
            Sheet sheet = new Sheet()
            {
                Id = spreadsheetDocument.WorkbookPart.
                    GetIdOfPart(worksheetPart),
                SheetId = 1,
                Name = "mySheet"
            };
            sheets.Append(sheet);
            Worksheet worksheet = new Worksheet();
            SheetData sheetData = new SheetData();
            Row row = new Row() { RowIndex = 1U, Spans = new ListValue<StringValue>() };
            Cell cell = new Cell()
            {
                CellReference = "A1", // Location of Cell
                DataType = CellValues.String,
                CellValue = new CellValue("CodingJump") // Setting Cell Value
            };

            row.Append(cell);
            sheetData.Append(row);
            worksheet.Append(sheetData);
            worksheetPart.Worksheet = worksheet;
            workbookpart.Workbook.Save();
            // Close the document.
            spreadsheetDocument.Close();
        }
    }
}

But still when we open the excel file generated we see the text but we can see the data type did not change.

/static/images/posts/2021/closedxml-openxml-setting-data-type-for-cell/image3.png

So setting datatype to the cell doesn’t actually change Excel Data Format which is shown in Excel

Similarly now if we just change the cell data type to number and rerun the code it will still run.

Cell cell = new Cell()
{
	CellReference = "A1",
	DataType = CellValues.Number,
	CellValue = new CellValue("CodingJump")
};

But will create an error while opening saying Excel is corrupted.

/static/images/posts/2021/closedxml-openxml-setting-data-type-for-cell/image4.png

Click on Yes, then excel will try to recover and remove contents and it will open successfully.

/static/images/posts/2021/closedxml-openxml-setting-data-type-for-cell/image5.png

So this data type is being used internally in Excel to identify if the cell contains Text/Numeric value, etc… Will explain more on this in later sections.

Setting Cell Data type with ClosedXML

Below is the example code where we are writing “CodingJump” to Cell A1 using the ClosedXML library.

using System;
using System.Data;
using ClosedXML.Excel;
namespace CellTypeEx
{
	public class Program
	{
		public static void Main(string[] args)
		{
			var workbook = new XLWorkbook();
			workbook.AddWorksheet("Sheet1");
			var worksheet = workbook.Worksheet("Sheet1");
			worksheet.Cell(1, 1).SetDataType(XLDataType.Text);
			worksheet.Cell(1, 1).Value = "CodingJump";
			workbook.SaveAs("ClosedXmlCellValueEx.xlsx", false);
		}
	}
}

As you can see the code complexity has been drastically reduced, but this is a different topic that I’ve discussed in another blog post.

When you open this excel file which has been created by ClosedXML the results are EXACTLY THE SAME.

Now let’s try changing the data type to numeric similar to how we have done in the OpenXML example.

worksheet.Cell(1, 1).SetDataType(XLDataType.Number);
worksheet.Cell(1, 1).Value = "CodingJump";

After running the code we can see that it compiles successfully, and then when we try to open the excel we can see that there is no error shown. ClosedXML tries to avoid introducing errors when a data type mismatch occurs. So all sorts of safety mechanisms are already in place.

Applying data format to a cell the right way

If you want excel to show the exact data type you’re setting, you’ve to use the Cell NumberFormat string. If you were setting DataType, it would not show in Excel as you’d expect in a normal excel workflow.

Setting Cell datatype using OpenXML Numberformat Id

This time we can take a different example, let say we are going to put a number 1 million to Cell A using OpenXML.

CellFormats.Append(new CellFormat()
{
	BorderId = 0,
	FillId = 0,
	FontId = 0,
	NumberFormatId = 4, // Nu
	FormatId = 0,
	ApplyNumberFormat = true
});

Cell cell = new Cell()
{
	CellReference = "A1",
	StyleIndex = 0, // First User Formatting defined
	CellValue = new CellValue("1000000")
};

Also, the code for doing this in OpenXML can get quite complicated and cumbersome to manage, as we need to create a stylesheet and all other formats like border, etc..

using System;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
namespace CellTypeEx
{
    class Program
    {
		private static Stylesheet GetStylesheet()
		{
			var StyleSheet = new Stylesheet();
			// Create "fonts" node.
			var Fonts = new Fonts();
			Fonts.Append(new Font()
			{
				FontName = new FontName() { Val = "Calibri" },
				FontSize = new FontSize() { Val = 11 },
				FontFamilyNumbering = new FontFamilyNumbering() { Val = 2 },
			});
			Fonts.Count = (uint)Fonts.ChildElements.Count;
			// Create "fills" node.
			var Fills = new Fills();
			Fills.Append(new Fill()
			{
				PatternFill = new PatternFill() { PatternType = PatternValues.None }
			});
			Fills.Append(new Fill()
			{
				PatternFill = new PatternFill() { PatternType = PatternValues.Gray125 }
			});
			Fills.Count = (uint)Fills.ChildElements.Count;
			// Create "borders" node.
			var Borders = new Borders();
			Borders.Append(new Border()
			{
				LeftBorder = new LeftBorder(),
				RightBorder = new RightBorder(),
				TopBorder = new TopBorder(),
				BottomBorder = new BottomBorder(),
				DiagonalBorder = new DiagonalBorder()
			});
			Borders.Count = (uint)Borders.ChildElements.Count;
			// Create "cellStyleXfs" node.
			var CellStyleFormats = new CellStyleFormats();
			CellStyleFormats.Append(new CellFormat()
			{
				NumberFormatId = 0,
				FontId = 0,
				FillId = 0,
				BorderId = 0
			});
			CellStyleFormats.Count = (uint)CellStyleFormats.ChildElements.Count;
			// Create "cellXfs" node.
			var CellFormats = new CellFormats();
			// #,##0.00
			CellFormats.Append(new CellFormat()
			{
				BorderId = 0,
				FillId = 0,
				FontId = 0,
				NumberFormatId = 4,
				FormatId = 0,
				ApplyNumberFormat = true
			});
			CellFormats.Count = (uint)CellFormats.ChildElements.Count;
			// Create "cellStyles" node.
			var CellStyles = new CellStyles();
			CellStyles.Append(new CellStyle()
			{
				Name = "Normal",
				FormatId = 0,
				BuiltinId = 0
			});
			CellStyles.Count = (uint)CellStyles.ChildElements.Count;
			// Append all nodes in order.
			StyleSheet.Append(Fonts);
			StyleSheet.Append(Fills);
			StyleSheet.Append(Borders);
			StyleSheet.Append(CellStyleFormats);
			StyleSheet.Append(CellFormats);
			StyleSheet.Append(CellStyles);
			return StyleSheet;
		}

        static void Main(string[] args)
        {
            string fileName = @"CellValueEx.xlsx";
            // Create a spreadsheet document by supplying the fileName.
            SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.
                Create(fileName, SpreadsheetDocumentType.Workbook);
            // Add a WorkbookPart to the document.
            WorkbookPart workbookpart = spreadsheetDocument.AddWorkbookPart();
            workbookpart.Workbook = new Workbook();
            // Add a WorksheetPart to the WorkbookPart.
            WorksheetPart worksheetPart = workbookpart.AddNewPart<WorksheetPart>();
            worksheetPart.Worksheet = new Worksheet(new SheetData());

			// Add Stylesheet.
            var WorkbookStylesPart = workbookpart.AddNewPart<WorkbookStylesPart>();
            WorkbookStylesPart.Stylesheet = GetStylesheet();
            WorkbookStylesPart.Stylesheet.Save();
            // Add Sheets to the Workbook.
            Sheets sheets = spreadsheetDocument.WorkbookPart.Workbook.AppendChild<Sheets>(new Sheets());
            // Append a new worksheet and associate it with the workbook.
            Sheet sheet = new Sheet()
            {
                Id = spreadsheetDocument.WorkbookPart.
                    GetIdOfPart(worksheetPart),
                SheetId = 1,
                Name = "mySheet"
            };
            sheets.Append(sheet);
            Worksheet worksheet = new Worksheet();
            SheetData sheetData = new SheetData();

            Row row = new Row() { RowIndex = 1U, Spans = new ListValue<StringValue>() };
            Cell cell = new Cell()
            {
                CellReference = "A1",
                StyleIndex = 0,
                CellValue = new CellValue("1000000")
            };

            row.Append(cell);
            sheetData.Append(row);
            worksheet.Append(sheetData);
            worksheetPart.Worksheet = worksheet;
            workbookpart.Workbook.Save();
            // Close the document.
            spreadsheetDocument.Close();
        }
    }
}

Once we run this code, we successfully generate the excel file and it shows the Format as a number correctly and also it is displaying the number with commas as we had set NumberFormatId as 4.

If you’re not sure about OpenXML NumberFormatId you can look into this table.

/static/images/posts/2021/closedxml-openxml-setting-data-type-for-cell/image6.png

Setting Cell datatype using ClosedXML Numberformat Id

Now let’s move to ClosedXML’s example of how to do the same thing i.e. put 1 million to cell A1.

using System;
using System.Data;
using ClosedXML.Excel;
namespace CellTypeEx
{
	public class Program
	{
		public static void Main(string[] args)
		{
			var workbook = new XLWorkbook();
			workbook.AddWorksheet("Sheet1");
			var worksheet = workbook.Worksheet("Sheet1");
			worksheet.Cell(1, 1).SetDataType(XLDataType.Number);
			worksheet.Cell(1, 1).Value = 1000000;
			worksheet.Cell(1, 1).Style.NumberFormat.NumberFormatId = 4;
			workbook.SaveAs("ClosedXmlCellValueEx.xlsx", false);
		}
	}
}

As you can see from the above code, it’s very easy to set the datatype for a cell using NumberFormat, especially using ClosedXML 🙂

Setting datatype using NumberFormat String using ClosedXML

Until now I’ve been just using the number format id which is predefined by Microsoft, but we might have special requirements i.e. placing commas in a number, Displaying dates in a different format, etc.

For this we can make use of number format strings, this works the same way for ClosedXML/OpenXML.

Let’s take an example. We want to add CodingJump to cell A1, and also define it as a Text cell.

worksheet.Cell(1, 1).Value = "CodingJump";
worksheet.Cell(1, 1).Style.NumberFormat.Format = "@";

We can also do the same thing with NumberFormat id as Text is already defined as type 49.

worksheet.Cell(1, 1).Value = "CodingJump";
worksheet.Cell(1, 1).Style.NumberFormat.NumberFormatId = 49;

The real reason behind why you should use DataType property in OpenXML

Say you wanted to write the current DateTime to a cell, if you use the data type as String while writing it to excel, then the numeric value will be written as a string in a cell.

Cell cell = new Cell()
{
	CellReference = "A1", // Location of Cell
	DataType = CellValues.String,
	CellValue = new CellValue(DateTime.Now.ToOADate().ToString(CultureInfo.InvariantCulture))
};
/static/images/posts/2021/closedxml-openxml-setting-data-type-for-cell/image1.png

Even if you change number formats in Excel, it will not recognize & will show some error prompt in the cell as shown in the picture.

Now if we change the Data Type to Number, then it will work as expected.

/static/images/posts/2021/closedxml-openxml-setting-data-type-for-cell/image2.png
Cell cell = new Cell()
{
	CellReference = "A1", // Location of Cell
	DataType = CellValues.Number,
	CellValue = new CellValue(DateTime.Now.ToOADate().ToString(CultureInfo.InvariantCulture))
};

One more thing to note is that CellValues.Date is an older API and not recommended in later versions, it’s best to use CellValues.Number as it can hold date/time, double, etc.

ClosedXML, which is written on top of OpenXML, also works similarly. So we need to provide the data type correctly so that we don’t get any unexpected problems with conversions, etc.

Conclusion

In this post, we have covered how to set & use number format in various ways and when to make use of data type, how it impacts the value that is being written/read to excel.

So the conclusion is that DataType is an internal property that lets us know data type, whereas NumberFormat is mainly used for display purposes, etc. Numbers can be displayed in a variety of ways in Excel.

It can get really confusing sometimes with how the APIs are named and their actual functionality, but it’s always good to know the differences between them. So that you can tackle them when you find these in your Excel Journeys 🙂