Overview
This implementation uses the NPOI library to extract data from Excel files (.xlsx). Excel cells can contain various types such as numbers, strings, dates, or formulas. This article provides a class that encapsulates the logic to correctly determine these types and handle them as native C# types.
Specifications (Input/Output)
- Input: File path, sheet index, and row/column numbers.
- Output: Cell value converted to the appropriate type (object).
- Dependencies: NPOI (NuGet package).
Implemented Methods
| Method Name | Description |
| Open | Reads the Excel file at the specified path and returns a workbook object. |
| SelectSheet | Loads the worksheet into memory using a 0-based index. |
| GetValue | Accesses a specific cell and retrieves its value. Returns null if the row or cell is undefined. |
| _CellValue | Internal logic that determines the cell type (CellType) and branches into numeric, string, date, boolean, or formula processing. |
Basic Usage
// Open the file
var book = MyExcelBook.Open("data.xlsx");
// Select the first sheet
book.SelectSheet(0);
// Get the value of the 3rd row, 2nd column (B3)
var value = book.GetValue(2, 1);
Console.WriteLine($"Value: {value}");
Full Code Example
using System;
using System.IO;
using NPOI.SS.UserModel;
using NPOI.XSSF.UserModel;
class Program
{
static void Main()
{
string path = "example.xlsx";
if (!File.Exists(path))
{
Console.WriteLine("Target file not found.");
return;
}
try
{
// 1. Open Excel file
var xls = MyExcelBook.Open(path);
// 2. Select sheet (0)
xls.SelectSheet(0);
// 3. Test value retrieval (e.g., Column A, Rows 0 to 4)
Console.WriteLine("--- Cell Values ---");
for (int i = 0; i < 5; i++)
{
// GetValue(row, col)
var val = xls.GetValue(i, 0);
Console.WriteLine($"Row {i}: {val?.ToString() ?? "(empty)"}");
}
}
catch (Exception ex)
{
Console.WriteLine($"Error: {ex.Message}");
}
}
}
/// <summary>
/// Wrapper class for reading Excel using NPOI
/// </summary>
public sealed class MyExcelBook
{
private XSSFWorkbook _workbook;
private ISheet _sheet;
private MyExcelBook()
{
}
/// <summary>
/// Opens an Excel file in read mode and creates an instance.
/// </summary>
public static MyExcelBook Open(string filePath)
{
var obj = new MyExcelBook();
using var stream = new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
obj._workbook = new XSSFWorkbook(stream);
return obj;
}
/// <summary>
/// Selects the target sheet by index.
/// </summary>
public void SelectSheet(int index)
{
_sheet = _workbook.GetSheetAt(index);
}
/// <summary>
/// Gets the cell value at the specified row and column.
/// </summary>
public object GetValue(int rowIndex, int colIndex)
{
if (_sheet == null) return null;
var row = _sheet.GetRow(rowIndex);
if (row == null) return null;
var cell = row.GetCell(colIndex);
if (cell == null) return null;
return _CellValue(cell, cell.CellType);
}
/// <summary>
/// Internal logic to extract the appropriate value based on CellType
/// </summary>
private object _CellValue(ICell cell, CellType type)
{
var actualType = (type == CellType.Unknown) ? cell.CellType : type;
switch (actualType)
{
case CellType.String:
return cell.StringCellValue;
case CellType.Boolean:
return cell.BooleanCellValue;
case CellType.Numeric:
// Determine if it is a date format
if (DateUtil.IsCellDateFormatted(cell))
{
return cell.DateCellValue; // DateTime type
}
return cell.NumericCellValue; // double type
case CellType.Formula:
// For formulas, recursively get the value based on the cached result type
return _CellValue(cell, cell.CachedFormulaResultType);
case CellType.Blank:
return string.Empty;
case CellType.Error:
return $"ERROR_CODE:{cell.ErrorCellValue}";
default:
return null;
}
}
}
Customization Points
- Unify Return Types: This implementation returns an
object. If casting is difficult on the caller side, you can modifyGetValueto call.ToString()and return everything as astring. - Specify Sheet Names: You can extend
SelectSheetto allow selecting by sheet name using_workbook.GetSheet("SheetName"). - Enhanced Error Handling: Improve reliability by adding specific handling for
IOException(e.g., when the file is open in another program) within theOpenmethod.
Important Notes
- Null Safety for Rows and Cells: In NPOI, rows or cells that have no data are null. Failing to check for null on
GetRoworGetCellresults in aNullReferenceException. - Internal Representation of Dates: Excel stores dates as numeric serial values. Without the
DateUtil.IsCellDateFormattedcheck, a date will be retrieved as a floating-point number like 45231.5. - Formula Calculation: This code retrieves the “cached result” from the last time Excel saved the file. If you modify cell values within NPOI, the formula results will not update unless you explicitly execute a recalculation process.
Advanced Usage
Text Dump of All Cell Data
The following is an example of an extension to retrieve all data within the effective range of a sheet as strings.
public void DumpSheetData()
{
if (_sheet == null) return;
for (int i = _sheet.FirstRowNum; i <= _sheet.LastRowNum; i++)
{
var row = _sheet.GetRow(i);
if (row == null) continue;
for (int j = row.FirstCellNum; j < row.LastCellNum; j++)
{
var cell = row.GetCell(j);
var val = (cell == null) ? "" : _CellValue(cell, cell.CellType);
Console.Write($"{val}\t");
}
Console.WriteLine();
}
}
Conclusion
When retrieving cell values with NPOI, strict type determination based on CellType and defensive programming against null objects are required. By encapsulating this logic within a class, the calling side can safely utilize data without being concerned with the internal specifications of Excel.
