[C#] How to Read Excel Cells and Implement Type Determination Logic with NPOI

目次

Overview

This implementation uses the NPOI library to extract data from Excel files (.xlsx). Excel cells can contain various types such as numbers, strings, dates, or formulas. This article provides a class that encapsulates the logic to correctly determine these types and handle them as native C# types.

Specifications (Input/Output)

  • Input: File path, sheet index, and row/column numbers.
  • Output: Cell value converted to the appropriate type (object).
  • Dependencies: NPOI (NuGet package).

Implemented Methods

Method NameDescription
OpenReads the Excel file at the specified path and returns a workbook object.
SelectSheetLoads the worksheet into memory using a 0-based index.
GetValueAccesses a specific cell and retrieves its value. Returns null if the row or cell is undefined.
_CellValueInternal logic that determines the cell type (CellType) and branches into numeric, string, date, boolean, or formula processing.

Basic Usage

// Open the file
var book = MyExcelBook.Open("data.xlsx");

// Select the first sheet
book.SelectSheet(0);

// Get the value of the 3rd row, 2nd column (B3)
var value = book.GetValue(2, 1);

Console.WriteLine($"Value: {value}");

Full Code Example

using System;
using System.IO;
using NPOI.SS.UserModel;
using NPOI.XSSF.UserModel;

class Program
{
    static void Main()
    {
        string path = "example.xlsx";

        if (!File.Exists(path))
        {
            Console.WriteLine("Target file not found.");
            return;
        }

        try
        {
            // 1. Open Excel file
            var xls = MyExcelBook.Open(path);

            // 2. Select sheet (0)
            xls.SelectSheet(0);

            // 3. Test value retrieval (e.g., Column A, Rows 0 to 4)
            Console.WriteLine("--- Cell Values ---");
            for (int i = 0; i < 5; i++)
            {
                // GetValue(row, col)
                var val = xls.GetValue(i, 0);
                
                Console.WriteLine($"Row {i}: {val?.ToString() ?? "(empty)"}");
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Error: {ex.Message}");
        }
    }
}

/// <summary>
/// Wrapper class for reading Excel using NPOI
/// </summary>
public sealed class MyExcelBook
{
    private XSSFWorkbook _workbook;
    private ISheet _sheet;
    
    private MyExcelBook() 
    {
    }

    /// <summary>
    /// Opens an Excel file in read mode and creates an instance.
    /// </summary>
    public static MyExcelBook Open(string filePath)
    {
        var obj = new MyExcelBook();
        
        using var stream = new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
        obj._workbook = new XSSFWorkbook(stream);
        
        return obj;
    }

    /// <summary>
    /// Selects the target sheet by index.
    /// </summary>
    public void SelectSheet(int index)
    {
        _sheet = _workbook.GetSheetAt(index);
    }

    /// <summary>
    /// Gets the cell value at the specified row and column.
    /// </summary>
    public object GetValue(int rowIndex, int colIndex)
    {
        if (_sheet == null) return null;

        var row = _sheet.GetRow(rowIndex);
        if (row == null) return null;

        var cell = row.GetCell(colIndex);
        if (cell == null) return null;

        return _CellValue(cell, cell.CellType);
    }

    /// <summary>
    /// Internal logic to extract the appropriate value based on CellType
    /// </summary>
    private object _CellValue(ICell cell, CellType type)
    {
        var actualType = (type == CellType.Unknown) ? cell.CellType : type;

        switch (actualType)
        {
            case CellType.String:
                return cell.StringCellValue;

            case CellType.Boolean:
                return cell.BooleanCellValue;

            case CellType.Numeric:
                // Determine if it is a date format
                if (DateUtil.IsCellDateFormatted(cell))
                {
                    return cell.DateCellValue; // DateTime type
                }
                return cell.NumericCellValue; // double type

            case CellType.Formula:
                // For formulas, recursively get the value based on the cached result type
                return _CellValue(cell, cell.CachedFormulaResultType);

            case CellType.Blank:
                return string.Empty;

            case CellType.Error:
                return $"ERROR_CODE:{cell.ErrorCellValue}";

            default:
                return null;
        }
    }
}

Customization Points

  • Unify Return Types: This implementation returns an object. If casting is difficult on the caller side, you can modify GetValue to call .ToString() and return everything as a string.
  • Specify Sheet Names: You can extend SelectSheet to allow selecting by sheet name using _workbook.GetSheet("SheetName").
  • Enhanced Error Handling: Improve reliability by adding specific handling for IOException (e.g., when the file is open in another program) within the Open method.

Important Notes

  • Null Safety for Rows and Cells: In NPOI, rows or cells that have no data are null. Failing to check for null on GetRow or GetCell results in a NullReferenceException.
  • Internal Representation of Dates: Excel stores dates as numeric serial values. Without the DateUtil.IsCellDateFormatted check, a date will be retrieved as a floating-point number like 45231.5.
  • Formula Calculation: This code retrieves the “cached result” from the last time Excel saved the file. If you modify cell values within NPOI, the formula results will not update unless you explicitly execute a recalculation process.

Advanced Usage

Text Dump of All Cell Data

The following is an example of an extension to retrieve all data within the effective range of a sheet as strings.

public void DumpSheetData()
{
    if (_sheet == null) return;

    for (int i = _sheet.FirstRowNum; i <= _sheet.LastRowNum; i++)
    {
        var row = _sheet.GetRow(i);
        if (row == null) continue;

        for (int j = row.FirstCellNum; j < row.LastCellNum; j++)
        {
            var cell = row.GetCell(j);
            var val = (cell == null) ? "" : _CellValue(cell, cell.CellType);
            Console.Write($"{val}\t");
        }
        Console.WriteLine();
    }
}

Conclusion

When retrieving cell values with NPOI, strict type determination based on CellType and defensive programming against null objects are required. By encapsulating this logic within a class, the calling side can safely utilize data without being concerned with the internal specifications of Excel.

よかったらシェアしてね!
  • URLをコピーしました!
  • URLをコピーしました!

この記事を書いた人

私が勉強したこと、実践したこと、してることを書いているブログです。
主に資産運用について書いていたのですが、
最近はプログラミングに興味があるので、今はそればっかりです。

目次