skip blank rows in load_csv, closes #29 #46
Open
HrachShah wants to merge 1 commit into
…KeyError Closes simonw#29. A file ending with \n (or any number of blank trailing lines) used to produce a dict entry that was missing every column, which then crashed with KeyError when the caller tried to access the key column. csv.reader returns an empty list for a fully blank row, so filtering those out before building dicts is enough - blank rows in the middle of the file are skipped the same way. Whitespace-only or comma-only lines still parse as data, which preserves the existing behaviour for inputs where those carry meaning.
Closes #29.
A file ending with `\n` (or any number of blank trailing lines) used to produce a dict entry that was missing every column, which then crashed with `KeyError: 'a'` (or whichever key column was passed) when the caller tried to look up the key. The error message gave no hint that the actual problem was a stray blank line at the end of the file.

`csv.reader` returns an empty list for a fully blank row (a record of just `\n`), so filtering those out before building the dicts is enough - blank rows in the middle of the file are skipped the same way. Whitespace-only and comma-only lines still parse as data rows, which preserves the existing behaviour for inputs where those carry meaning (e.g. a row of `,` is a row with two empty fields, not a blank row).

Reproduce
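The parsing behaviour described above can be checked with the standard library alone. This is a minimal sketch on made-up sample data, not the PR's own reproduction; the names `sample`, `records` and `entries` are illustrative only.

```python
import csv
import io

# Made-up sample: header, one data row, then a blank line,
# a whitespace-only line and a comma-only line.
sample = "a,b\n1,x\n\n \n,\n"

headings, *records = csv.reader(io.StringIO(sample))
print(records)  # [['1', 'x'], [], [' '], ['', '']]

# Zipping the headings with an empty record gives an empty dict,
# so looking up the key column on it fails.
entries = [dict(zip(headings, record)) for record in records]
try:
    keys = [entry["a"] for entry in entries]
except KeyError as exc:
    print("KeyError:", exc)  # KeyError: 'a'
```

Only the fully blank line comes back as `[]`; the whitespace-only and comma-only lines still carry one and two fields respectively.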
Before:
After:
Tests added
tests/test_csv_diff.py:

- test_trailing_blank_line_is_skipped: loads a CSV with a single trailing blank line and asserts the key lookup no longer raises and the loaded rows are correct.
- test_multiple_blank_lines_and_interior_blank_skipped: loads a CSV with several trailing blank lines plus a blank line in the middle, asserts the rows dict is identical to the same data without the blank lines.
- test_compare_with_trailing_blank_lines: runs compare() against two CSVs that both end in blank lines and asserts the diff result is equivalent to comparing the same data without the blank lines.

All 27 tests pass (24 existing + 3 new).
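For reviewers who want to see the shape of the fix without opening the diff, the filtering can be sketched roughly as follows. `load_rows` is a hypothetical stand-in written for this illustration, not the project's `load_csv`, and the sample data is invented.

```python
import csv
import io


def load_rows(fp, key):
    """Hypothetical keyed CSV loader, used here only to illustrate the filter."""
    reader = csv.reader(fp)
    headings = next(reader)
    # csv.reader yields [] for a fully blank line, so a truthiness
    # check drops those rows and nothing else.
    rows = [dict(zip(headings, line)) for line in reader if line]
    return {row[key]: row for row in rows}


with_blanks = "id,name\n1,Cleo\n\n2,Pancakes\n\n\n"
without_blanks = "id,name\n1,Cleo\n2,Pancakes\n"

# Interior and trailing blank lines no longer change the result.
assert load_rows(io.StringIO(with_blanks), "id") == load_rows(
    io.StringIO(without_blanks), "id"
)
```

The check is on the parsed row rather than the raw line, which is why a comma-only line (parsed as `['', '']`) is still treated as data.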