TabularData DataFrame: removing a row results in EXC_BAD_ACCESS

I am working with data in Swift using the TabularData framework. I load data from a CSV file into a DataFrame, then copy the data into a second DataFrame, and finally remove a row from the second DataFrame.

The problem arises when I try to remove a row from the second DataFrame, at which point I receive an EXC_BAD_ACCESS error. However, if I modify the "timings" column (the final column) before removing the row (even to an identical value), the code runs without errors.

Interestingly, this issue only occurs when a value in the "timings" column of the CSV file contains more than 15 characters.

This is the code I'm using:

import Foundation
import TabularData

func loadCSV() {
    let documentsDirectory = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask).first!
    let url = documentsDirectory.appendingPathComponent("example.csv")

    var dataframe: DataFrame
    do {
        dataframe = try .init(
            contentsOfCSVFile: url,
            columns: ["user", "filename", "syllable count", "timings"],
            types: ["user": .string, "filename": .string, "syllable count": .integer, "timings": .string]
        )
    } catch {
        fatalError("Failed to load csv data")
    }

    print("First data frame", dataframe, separator: "\n") // This works

    var secondFrame = DataFrame()
    secondFrame.append(column: Column<String>(name: "user", capacity: 1000))
    secondFrame.append(column: Column<String>(name: "filename", capacity: 1000))
    secondFrame.append(column: Column<Int>(name: "syllable count", capacity: 1000))
    secondFrame.append(column: Column<String>(name: "timings", capacity: 1000))

    for row in 0..<dataframe.rows.count {
        secondFrame.appendEmptyRow()
        for col in 0..<4 {
            secondFrame.rows[row][col] = dataframe.rows[row][col]
        }
        // If we include the next line, it will not crash, even though the content is the same:
        // secondFrame.rows[row][3, String.self] = String("0123456789ABCDEF")
    }

    print("Second data frame before removing row", secondFrame, separator: "\n") // Before removal
    secondFrame.removeRow(at: 0)
    print("Second data frame after removing row", secondFrame, separator: "\n") // After removal: we will get Thread 1: EXC_BAD_ACCESS here. The line will still print, however
}

And the CSV (minimal example):

user,filename,syllable count,timings
john,john-001,12,0123456789ABCDEF
jane,jane-001,10,0123456789ABCDE
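
Counting the characters in the two "timings" values above shows that they sit on either side of the 15-character boundary: the first is 16 characters long and the second is 15. A quick check in plain Swift, where exceedsObservedLimit is a made-up helper for illustration and not part of TabularData:

```swift
// The two "timings" values from the sample CSV above.
let timings = ["0123456789ABCDEF", "0123456789ABCDE"]

// Hypothetical helper (not a TabularData API): true when a value is longer
// than the 15-character limit observed in this report.
func exceedsObservedLimit(_ value: String, limit: Int = 15) -> Bool {
    return value.count > limit
}

// Print each value with its length and whether it crosses the limit.
for value in timings {
    print(value, value.count, exceedsObservedLimit(value))
}
```

Only the first row crosses the limit, which matches the row whose removal triggers the crash in the code above.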

I've been able to replicate this bug on macOS and iOS using minimal projects. I'm unsure why this error is occurring and why modifying the "timings" column prevents it.

It should be noted that this same error occurs with a single data frame loaded from a CSV file, which means that I basically cannot load from CSV if I want to modify the DataFrame afterwards.


Replies

This is most definitely a bug, probably in TabularData itself. I boiled your example down to this:

import Foundation
import TabularData

func test() {
    let csv = """
        c
        CCCCCCCCCCCCCCCC
        """

    var dataFrame = try! DataFrame(
        csvData: Data(csv.utf8),
        columns: ["c"],
        types: ["c": .string]
    )
    dataFrame.removeRow(at: 0)
}

test()

and it crashes in roughly the same way. I’m running this as a command-line tool on macOS 13.4. I tested with both Xcode 14.3 and Xcode 15 beta and it crashes either way. I then put this code into a tiny iOS test project and ran it on the iOS 17 beta simulator. It doesn’t crash there, suggesting that the bug has already been fixed.

The really interesting thing is that removing a single character from CCCCCCCCCCCCCCCC makes the problem go away.

Share and Enjoy

Quinn “The Eskimo!” @ DTS @ Apple

Okay, thanks for letting me know and boiling it down to the simplest possible example! Hopefully this gets fixed, because the TabularData library is otherwise really nice to work with.