csv: allow csv records with varying lengths, padding with empties

Sometimes trailing empty fields are omitted entirely (including the
commas) in CSV records. (I see this in exported Google spreadsheets.)
We no longer raise an error in this case; instead we automatically pad
any "short" records with empty fields. Not yet well tested.
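
The padding described above is not part of the hunk shown on this page, so here is a minimal sketch of one way it could work. It assumes records are plain lists of strings and that every record is padded to the length of the longest one. The name `padRecords` is hypothetical and is not taken from the commit.

```haskell
-- Hypothetical helper (not from the commit): pad every record with
-- empty fields so that all records are as long as the longest one.
padRecords :: [[String]] -> [[String]]
padRecords rs = map pad rs
  where
    -- Width of the longest record; 0 when there are no records.
    width = maximum (0 : map length rs)
    -- Append empty fields, then cut back to the target width.
    -- No record is longer than width, so no data is dropped.
    pad r = take width (r ++ repeat "")
```

Under these assumptions, `padRecords [["2019-10-02","coffee","3.50"],["2019-10-03","rent"]]` leaves the first record unchanged and turns the second into `["2019-10-03","rent",""]`.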
Simon Michael 2019-10-02 14:48:51 -10:00
parent 6dcddadd9f
commit eff1b31c86


@@ -222,14 +222,17 @@ validateCsv numhdrlines (Right rs) = validate $ drop numhdrlines $ filternulls r
   where
     filternulls = filter (/=[""])
     validate [] = Right []
-    validate rs@(first:_)
-      | isJust lessthan2 = let r = fromJust lessthan2 in Left $ printf "CSV record %s has less than two fields" (show r)
-      | isJust different = let r = fromJust different in Left $ printf "the first CSV record %s has %d fields but %s has %d" (show first) length1 (show r) (length r)
+    validate rs@(_first:_)
+      | isJust lessthan2 = let r = fromJust lessthan2 in
+          Left $ printf "CSV record %s has less than two fields" (show r)
+      -- | isJust different = let r = fromJust different in
+      --     Left $ printf "the first CSV record %s has %d fields but %s has %d"
+      --     (show first) length1 (show r) (length r)
       | otherwise = Right rs
       where
-        length1 = length first
         lessthan2 = headMay $ filter ((<2).length) rs
-        different = headMay $ filter ((/=length1).length) rs
+        -- length1 = length first
+        -- different = headMay $ filter ((/=length1).length) rs
 -- -- | The highest (0-based) field index referenced in the field
 -- -- definitions, or -1 if no fields are defined.